Find duplicates step - why should I create custom rules?

Chris Downer, Experian Contributor

The Find Duplicates step comes with three powerful rule sets for each of our core countries. These rules work well for out-of-the-box duplicate detection if you are using standard contact data that fulfils the requirements of the blocking keys and rule set.

Let’s take the Find Duplicate Individuals rule set. At the most basic level it takes Name and Postal Address data. You can map all your data to the Name field and the Address field, and our powerful standardization engine will put the values in the correct fields. (For a full list of the elements that Find Duplicates generates through the standardization process, see the advanced configuration/elements documentation.)

The list of address and name elements is comprehensive, but what if a customer’s schema does not contain all the fields required by the default rules? What if a customer has bespoke elements that they wish to use in their Find Duplicates data set?

In these cases, a set of custom blocking keys and rules can be created in the Glossary in Data Studio.

Find Duplicates supports some non-address elements that can be incorporated into the default rule set with some custom comparators:

Any of these elements can be incorporated into the rules, but the most flexible element is Generic_String.

Using the "Generic_String" element, customers can incorporate any field into their custom rule set. This could be a license number, product code, national insurance number or any other unique string.
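To make the idea concrete, here is a rough Python sketch of what blocking on one field and then matching on an arbitrary string field looks like in principle. This is not Data Studio rule syntax and not how Find Duplicates is implemented; the function names (`normalise`, `find_duplicates`) and the sample fields (`postcode`, `licence`) are made up for illustration only.

```python
from collections import defaultdict
from itertools import combinations


def normalise(value: str) -> str:
    """Uppercase and keep only letters and digits, so formatting differences are ignored."""
    return "".join(ch for ch in value.upper() if ch.isalnum())


def find_duplicates(records, block_field, match_field):
    """Group records by a blocking key, then compare match_field exactly within each block.

    Hypothetical helper for illustration; it only mimics the general
    "block first, then apply a rule" pattern.
    """
    blocks = defaultdict(list)
    for rec in records:
        blocks[normalise(rec[block_field])].append(rec)

    pairs = []
    for block in blocks.values():
        # Only records that share a blocking key are ever compared.
        for a, b in combinations(block, 2):
            if normalise(a[match_field]) == normalise(b[match_field]):
                pairs.append((a["id"], b["id"]))
    return pairs


# Made-up sample data: "licence" plays the role of a generic string field.
records = [
    {"id": 1, "postcode": "SW1A 1AA", "licence": "ab-123 456"},
    {"id": 2, "postcode": "sw1a1aa", "licence": "AB123456"},
    {"id": 3, "postcode": "SW1A 1AA", "licence": "ZZ999999"},
]
pairs = find_duplicates(records, "postcode", "licence")
print(pairs)
```

In this sketch all three records share a blocking key, but only records 1 and 2 carry the same licence string once formatting is stripped, so only that pair is reported.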

You can even use Generic_String to introduce flags into your data to run nested matching jobs (more than one rule set in series) or to improve the efficiency of cross-file matching!
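As a loose illustration of the cross-file case, the Python sketch below tags each record with a source flag and skips any comparison between two records from the same file. It is a conceptual example only, not Data Studio configuration; `cross_file_pairs`, `source` and `code` are hypothetical names chosen for the example.

```python
from itertools import combinations


def cross_file_pairs(records, key_field, flag_field):
    """Return matching pairs, comparing only records whose flag values differ.

    Hypothetical helper: the flag stands in for a value you might carry
    in a generic string field to mark which file a record came from.
    """
    pairs = []
    for a, b in combinations(records, 2):
        if a[flag_field] == b[flag_field]:
            continue  # same source file, so skip the comparison entirely
        if a[key_field] == b[key_field]:
            pairs.append((a["id"], b["id"]))
    return pairs


# Made-up sample data from two files, flagged "A" and "B".
flagged = [
    {"id": 1, "source": "A", "code": "X1"},
    {"id": 2, "source": "A", "code": "X1"},
    {"id": 3, "source": "B", "code": "X1"},
    {"id": 4, "source": "B", "code": "Y2"},
]
matches = cross_file_pairs(flagged, "code", "source")
print(matches)
```

Records 1 and 2 have the same code but are never compared with each other because they come from the same file; only the cross-file pairs are evaluated.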
