The results of my Find duplicates step contain some empty gaps, and when I checked, all of the gaps are in rows where the Action in this column is Affected. Can someone explain what this actually means? Am I missing any data, or is this just an extra field, so that I can remove all the Affected rows? There are two values in this column, Insert and Affected, so it would be great to understand a bit more about this!



    The Find duplicates step can either establish a new Duplicate store or update an existing one, which is what is happening in this case. You can prevent this by selecting 'Clear and re-establish store'. When updating, the column values to the left of the Cluster ID are the records you are Inserting (or Updating), but this action will naturally impact other records that are already in the store. The impacted records are shown to the right of the Cluster ID: the cluster they are in, their updated Match status, and their Unique ID.

    INSERT indicates a new record. AFFECTED is described in the documentation as: "The record's cluster ID or match status has potentially changed due to the deletion of another record in the cluster" (from here: https://docs.experianaperture.io/data-quality/aperture-data-studio-v2/get-started/create-a-workflow/#find-duplicates-delete).

    From the small screenshot, my guess is that the Unique IDs for all records have changed (or were not specified for the records being updated). So Cluster 183 previously had one record with no matches, and therefore a match status of 4; it now has a similar matching record and an updated match status of 1.
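
To make the Insert/Affected distinction concrete, here is a minimal Python sketch of the Cluster 183 scenario guessed at above. It is only a toy model and not how Aperture Data Studio works internally: `StoreRecord`, `insert_record`, and the IDs "old-1" and "new-1" are made up for illustration, and the matching itself is assumed rather than computed. The status values 4 and 1 are simply the ones mentioned in the reply.

```python
from dataclasses import dataclass


@dataclass
class StoreRecord:
    # Hypothetical stand-in for one record held in a duplicate store.
    unique_id: str
    cluster_id: int
    match_status: int


def insert_record(store, unique_id, cluster_id, new_status):
    """Add a record to a cluster and list every row the update touches.

    Assumption: the caller already knows which cluster the new record
    matches and what the cluster's match status becomes.
    """
    rows = [("INSERT", unique_id, cluster_id, new_status)]
    for rec in store:
        # Records already in the cluster are not new, but their status changes.
        if rec.cluster_id == cluster_id and rec.match_status != new_status:
            rec.match_status = new_status
            rows.append(("AFFECTED", rec.unique_id, rec.cluster_id, new_status))
    store.append(StoreRecord(unique_id, cluster_id, new_status))
    return rows


# Cluster 183 starts with a single unmatched record (status 4 in the reply).
store = [StoreRecord("old-1", 183, 4)]

# A similar record arrives under a different Unique ID (status 1 in the reply).
rows = insert_record(store, "new-1", 183, 1)
for row in rows:
    print(row)
```

In this toy model the AFFECTED row carries no new input data of its own, which would be consistent with the gaps on the left-hand side of the results: it only reports that a record already in the store ended up with a different match status.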


