Find Duplicate is showing empty records along with input records
We are using a remote duplicate store. Whenever we run the workflow, the Find Duplicate step's result shows the cluster ID and Match status as expected for the input data set. However, there are many blank records in its output. All of these blank records show their "Duplicates: Action" as "AFFECTED", while all of the input records show "INSERT" in this column. Please let me know how we can avoid producing these empty records in the Find Duplicate step result.
Answers
Hi Arun
It sounds like you are using a fairly recent version of Data Studio, as changes were made to Find duplicates in v2.9:
https://community.experianaperture.io/discussion/919/aperture-data-studio-version-2-9-oct-2022
The Find duplicates step will insert/update records to an existing Duplicate store and provide detail on what has happened to each record. It is possible to 'Clear' the contents of an existing store, which will revert it to its un-established state and allow the store's configuration to be edited.
Hope that helps. There is a video 📺 in the comments of the latest release notes covering the recent improvements to Find duplicates: https://community.experianaperture.io/discussion/977/aperture-data-studio-version-2-10-feb-2023-inc-download-links