Exact match, fuzzy match and de-duplication with Find Duplicates
Sueann See
Experian Super Contributor
Hi everyone,
I'm starting a series of articles about matching and linking records to find duplicates in Aperture Data Studio. I hope it encourages some learning and interaction.
Start here:
Simple ways to identify and resolve duplicated data in Aperture Data Studio:
- Exact match with List functions
- Exact match with Group Step
- Best record with Harmonize Duplicates Step
More advanced techniques for exact and fuzzy matching:
More on Find Duplicates Blocking Keys and Rules:
- Quick introduction to Find Duplicates Blocking Keys and Rules
- How to identify Find Duplicates Blocking Keys
- Building Rules
- Tips and tricks to make Find Duplicates Blocking Keys and Rules more readable
- Maximum cluster limit
- Tuning Blocking Keys
- Review Find Duplicates step results, compare and visualize rules
- Tune Rules with Find Duplicates Workbench machine learning
- The chaining effect
More on optimizing de-duplication efforts:
- Find Duplicates with phonetic comparators
- Standardize names and addresses
- Identifying two names in a single string with the DelimitedField filter
If you have any suggestions for the next article, feel free to share.