Scalable Entity Resolution With Python and ML

Real-world data is far from perfect. It often contains multiple records belonging to the same entity (e.g., a customer or a property). These records can come from multiple systems and vary across attributes, which makes them hard to combine, especially as data volumes grow. Unharmonized data is not fit for use in customer analytics, risk, or compliance, so data engineers and scientists end up building some sort of rule- or heuristic-based system to manage it.

This talk will cover entity resolution, which is also referred to as identity resolution, record linkage, deduplication, or fuzzy matching. Entity resolution links and unifies records that refer to the same real-world entity, such as a customer or supplier.
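To make the idea of fuzzy matching concrete, here is a minimal sketch using only the Python standard library. It is not Zingg's method or API. The sample records, the `similarity` and `match_pairs` helpers, and the 0.85 threshold are all illustrative assumptions.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical sample records: the first two describe the same person
# with a spelling variation and different casing.
records = [
    {"id": 1, "name": "Jonathan Smith", "city": "New York"},
    {"id": 2, "name": "Jonathon Smith", "city": "new york"},
    {"id": 3, "name": "Maria Garcia", "city": "Austin"},
]


def similarity(a, b):
    """Return a 0..1 similarity score for two strings, ignoring case and outer whitespace."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def match_pairs(rows, threshold=0.85):
    """Compare every pair of records and return the id pairs that look like the same entity.

    The score is the plain average of name and city similarity; the
    threshold is an assumed value chosen for this toy data.
    """
    pairs = []
    for left, right in combinations(rows, 2):
        score = (
            similarity(left["name"], right["name"])
            + similarity(left["city"], right["city"])
        ) / 2
        if score >= threshold:
            pairs.append((left["id"], right["id"]))
    return pairs


matches = match_pairs(records)
print(matches)
```

This sketch compares every record with every other record, so the number of comparisons grows quadratically with the number of records. That growth is one reason hand-built matching becomes hard to maintain as data volumes increase.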

This talk will cover the needs and challenges of entity resolution and introduce the open-source Python package Zingg (https://github.com/zinggAI/zingg), which can be used to resolve entities at scale.

We will discuss Zingg's algorithms and how to use its Python API.

About the speaker

Sonal Goyal

Founder at Zingg.ai

Sonal Goyal, the founder of Zingg.ai, focuses on building open-source identity resolution systems. Passionate about tech, she applies AI, UX, and distributed systems to solve enterprise data problems at scale. With a vision for easy-to-use and scalable data systems, Sonal leads Zingg in providing unified and trusted views of core business entities. She actively hires for software development roles and marketing internships, reflecting her commitment to growing her team and expanding Zingg’s reach.
NLP-Summit

When

Online Event: September 24, 2024

 

Contact

nlpsummit@johnsnowlabs.com

Presented by

John Snow Labs