Data Studio V2.0 Find Duplicates
Hi
Thank you for the answers about using the Find Duplicates Workbench. I will investigate that in the longer term, but in the interim I am trying to put together some quick rules and blocking keys to do the following as best I can:
I want the duplicates step to look at First Name (FORNAME) and Surname (SURNAME) as a match first
If I have Julie Victoria Smith and Julie Smith, it should see them as a possible match
Then I have 2 addresses: Residential, using Premise & Street, Locality, State and Postal Code, and
Property, using the same fields. From what I have found, I need to add a group ID to the blocking keys to do this
So if first name and surname probably match, check whether the addresses match next
Then look at 4 different phones and 2 different emails, so I assume I need to group these as well
If I add a group ID, does it look at all of the phones, for example, when it is looking for a match?
Can I get some assistance to take a copy of AU Individual Default and update it to do the above?
Please advise
thanks
Carolyn
Answers
Hi Carolyn,
If I understand correctly, you want to add further elements, such as Phone and Email, to the Match.Probable rule in order to make it stricter.
If that is the case, you could change the rule to something like:
Match.Probable={Name.Probable & #[Residential,Property].Address.Probable & #[Phone1,Phone2,Phone3,Phone4].Phone.Exact & #[Email1,Email2].Email.Exact}
You would also need to add Phone and Email element rules below the match level rules.
Phone.Exact={[ExactMatch]}
Email.Exact={[ExactMatch]}
In this case, two records would be a match if name and address match at the probable level, and at least one of the phone numbers and at least one of the emails also match.
When Group IDs are used, every phone number on one candidate record is compared with every phone number on the other candidate record to look for a match.
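To illustrate what "at least one match across the group" means, here is a minimal Python sketch. It is not how Data Studio implements matching; the record layout, the function names and the handling of blank values (ignored rather than treated as matches) are all assumptions made for the example.

```python
from itertools import product


def normalise(value):
    """Trim and lowercase a value; return None for missing or blank input."""
    if value is None:
        return None
    cleaned = value.strip().lower()
    return cleaned or None


def any_exact_match(group_a, group_b):
    """Return True if any populated value in group_a equals any in group_b."""
    left = [v for v in map(normalise, group_a) if v is not None]
    right = [v for v in map(normalise, group_b) if v is not None]
    # Cross-compare every value on one record with every value on the other.
    return any(a == b for a, b in product(left, right))


# Hypothetical records: four phone columns and two email columns each.
record_1 = {
    "phones": ["0412345678", None, "0299990000", ""],
    "emails": ["julie.smith@example.com", None],
}
record_2 = {
    "phones": ["0299990000", "0400000000", None, None],
    "emails": ["jsmith@example.com", "Julie.Smith@example.com "],
}

phones_match = any_exact_match(record_1["phones"], record_2["phones"])
emails_match = any_exact_match(record_1["emails"], record_2["emails"])
```

With these sample records both phones_match and emails_match are True, because one phone number and one email are shared even though they sit in different columns on each record.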
You would also need to modify the Blocking Keys so that each address element in the blocking key includes elementGroups, as described in the cross field matching section of the advanced config documentation:
https://www.edq.com/documentation/aperture-data-studio/find-duplicates-step/advanced-config/
Thanks,
Katya
The new documentation URL is: Data Quality user documentation | Advanced configuration (experianaperture.io)