Why might my Find Duplicates results look different?

Clinton Jones Experian Elite
edited September 2019 in General discussion

If I use the Find Duplicates step on its own, in some instances I get more clusters of records (clusterIDs) than if I use Data Studio with the Address matching step.

I have attached an example that illustrates this, using the test data delivered with the application.

Why would that be?


  • The Validate Address step automatically generates "Find Duplicates Data" when you execute it (for supported regions). This data takes the best address fields from both the raw and the validated address.

    When you attach a Find Duplicates step to a Validate Address step, it will automatically use the best address data generated by the Validate Address step.

    This can give you different results from those you would get without the Validate Address step (or indeed from running Find Duplicates on just the Validate Address output columns).
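
To see why the choice of input can change the number of clusters, here is a toy sketch in Python. It is not how Data Studio matches records: the real Find Duplicates step is far more sophisticated, and the names normalise and cluster_count, the abbreviation table and the sample addresses are all made up for illustration. It only shows the general effect, which is that cleaner, standardised address data tends to pull more records into the same cluster.

```python
# Toy illustration only: this is NOT Data Studio's matching logic.
# normalise(), cluster_count() and the sample records are hypothetical.

def normalise(address: str) -> str:
    """Stand-in for address validation: lower-case, drop punctuation,
    and expand a few common abbreviations."""
    replacements = {"rd": "road", "st": "street", "ave": "avenue"}
    words = address.lower().replace(",", " ").replace(".", " ").split()
    return " ".join(replacements.get(w, w) for w in words)


def cluster_count(keys) -> int:
    """Exact-match clustering: records with the same key share a cluster."""
    return len(set(keys))


raw = [
    "12 High St, Leeds",
    "12 High Street Leeds",
    "12 high st. leeds",
    "4 Park Rd, York",
    "4 Park Road, York",
]

# Matching on the raw text keeps every spelling variant apart.
raw_clusters = cluster_count(raw)

# Matching on the standardised text groups the variants together.
validated_clusters = cluster_count(normalise(a) for a in raw)

print(raw_clusters, validated_clusters)  # prints: 5 2
```

In this sketch the raw input produces five clusters and the standardised input produces two, which mirrors the pattern described in the question: more clusterIDs when Find Duplicates runs on its own than when it runs on the data prepared by Validate Address.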

