Select first three records in each cluster
Hi
After running a Find Duplicates step at individual level, I can see that we have many contact names for some businesses, so I want to cap these at three contacts per business address.
I'm thinking I should run another Find Duplicates step with the Household step setting to get clusters of businesses, but from there I'm not sure how to select just three records from each cluster. In SQL Server I have used ROW_NUMBER() OVER (PARTITION BY ...) to achieve a similar goal. Does anyone know how I might do this in Aperture, please?
Thanks
Luke
Answers
Hi Luke
You can add the row number using the 'Current Row' function: https://docs.experianaperture.io/data-quality/aperture-data-studio-v2/get-started/create-functions/#dynamic-reference~native-functions
There is a recent post here that might be useful:
There are a couple of different approaches to append or calculate an 'Occurrence' column.
Once that column is calculated, you can filter to only include rows whose Occurrence is no greater than N (three in your case, assuming the count starts at 1).
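To make the idea concrete, here is a rough sketch of the Occurrence-then-filter logic in plain Python. This is not Aperture function syntax, just an illustration of the same idea as row_number over partition; the column names "Cluster ID" and "Contact" and both helper functions are made up for the example, and the count is assumed to start at 1 and follow the existing row order.

```python
from collections import defaultdict


def add_occurrence(rows, cluster_key):
    """Return copies of the rows with a 1-based 'Occurrence' count per cluster.

    Rows are numbered in the order they arrive, so sort them first if a
    particular contact should be preferred within each cluster.
    """
    seen = defaultdict(int)  # cluster value -> rows seen so far
    numbered = []
    for row in rows:
        seen[row[cluster_key]] += 1
        numbered.append({**row, "Occurrence": seen[row[cluster_key]]})
    return numbered


def first_n_per_cluster(rows, cluster_key, n):
    """Keep only the first n rows of each cluster."""
    return [r for r in add_occurrence(rows, cluster_key) if r["Occurrence"] <= n]


# Hypothetical sample data: cluster 1 has four contacts, cluster 2 has two.
contacts = [
    {"Cluster ID": 1, "Contact": "A"},
    {"Cluster ID": 1, "Contact": "B"},
    {"Cluster ID": 2, "Contact": "E"},
    {"Cluster ID": 1, "Contact": "C"},
    {"Cluster ID": 1, "Contact": "D"},
    {"Cluster ID": 2, "Contact": "F"},
]

capped = first_n_per_cluster(contacts, "Cluster ID", 3)
kept_names = [r["Contact"] for r in capped]
```

With this sample data the fourth contact in cluster 1 ("D") is dropped and everything else is kept. The two steps map onto the suggestion above: first calculate the Occurrence column, then filter on it.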
Hi Josh
With some ideas from the post I was able to do this...
Many Thanks
Luke