Find Duplicates Step configuration

What is the difference between the Name, Household and Address/Location choices in the default Find Duplicates step?



  • Akshay Davis, Administrator

    Name (Individual), Household and Location are the default rule sets, covering the most common combinations of name and address attributes.

    Location: This considers only address elements, so records are compared solely on the similarity of their addresses.

    Household: A household is defined as members of the same family, identified by a shared surname, living at the same address. These rules look for records with the same address and surname.

    Individual: An individual rule matches records that represent the same person. As only two attributes are used, name and address, both need to correlate for a match to occur. These rules do allow for a lower-confidence match when someone at the same address has changed their name.
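
    To make the distinction concrete, here is a minimal Python sketch of the three levels. It is not how Aperture Data Studio implements its rules: the Record type, the function names and the exact-match-after-normalisation comparison are all assumptions for illustration, and I have modelled the "change of name" case as same forename, different surname, same address.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    # Hypothetical record layout, not the Aperture Data Studio schema.
    forename: str
    surname: str
    address: str


def norm(value: str) -> str:
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(value.lower().split())


def location_match(a: Record, b: Record) -> bool:
    """Location: compare on address only."""
    return norm(a.address) == norm(b.address)


def household_match(a: Record, b: Record) -> bool:
    """Household: same address and same surname."""
    return location_match(a, b) and norm(a.surname) == norm(b.surname)


def individual_match(a: Record, b: Record) -> str:
    """Individual: name and address must both agree.

    Returns "exact", "possible" (lower confidence, assumed here to mean
    a surname change at the same address) or "none".
    """
    if not location_match(a, b):
        return "none"
    same_forename = norm(a.forename) == norm(b.forename)
    same_surname = norm(a.surname) == norm(b.surname)
    if same_forename and same_surname:
        return "exact"
    if same_forename:
        return "possible"
    return "none"
```

    With this sketch, Jane Smith and John Smith at the same address match on Location and Household but not on Individual, while Jane Smith and Jane Jones at the same address come back as a "possible" Individual match.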

    Addresses can be complex to compare and classify. Whether two addresses are similar enough for automatic harmonization, or need manual review, depends on the country and on which address elements are significant there. The postal code is a good example. In the UK a postal code resolves to a small number of addresses and is an element most people know accurately. In the US, on the other hand, a ZIP code covers a much larger area, so it carries less significance than a UK postal code. The address rules supplied with Aperture Data Studio account for these differences.
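
    As a rough illustration of why postal code significance matters, the sketch below weights a postal code match differently per country. The weights, the function name and the scoring formula are made up for this example and are not the values or logic used by the supplied address rules; the postal codes are dummy values.

```python
from difflib import SequenceMatcher

# Hypothetical weights: how much a postal code match contributes to the score.
POSTCODE_WEIGHT = {"GB": 0.6, "US": 0.3}


def address_score(street_a: str, street_b: str,
                  postcode_a: str, postcode_b: str,
                  country: str) -> float:
    """Blend street similarity and postal code equality into a 0..1 score."""
    weight = POSTCODE_WEIGHT[country]
    # Character-level similarity of the street lines, between 0 and 1.
    street_sim = SequenceMatcher(None, street_a.lower(), street_b.lower()).ratio()

    # Postal codes are compared exactly, ignoring spaces and case.
    def clean(code: str) -> str:
        return code.replace(" ", "").upper()

    postcode_sim = 1.0 if clean(postcode_a) == clean(postcode_b) else 0.0
    return weight * postcode_sim + (1 - weight) * street_sim


# Same street lines and matching postal codes in both countries.
gb = address_score("10 Mill Lane", "12 Hill Road", "AB1 2CD", "ab12cd", "GB")
us = address_score("10 Mill Lane", "12 Hill Road", "12345", "12345", "US")
```

    Here gb comes out higher than us, because a matching postal code is given more weight for GB. A score like this could then be compared against thresholds to decide between automatic harmonization and manual review.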

