I have a dataset that has the following fields (as a subset):
First Name | Middle Names | Surname | Address1   | Address 2 | Locality | Postcode | Country | DOB
Julie      | Joan         | Smith   | 1 Smith St |           |          |          |         |
We are commencing data cleansing. While finding duplicates and clustering against a more reliable source, we found the following:
Source  | First Name | Middle Names | Surname | Address1   | Address 2 | Locality | Postcode | Country | DOB
Finance | Julie      | Joan         | Smith   | 1 Smith St |           |          |          |         |
Sales   | Julie      | Joan         | Smith   | 1 Smith St |           | Geelong  | 3220     | Aust    | 01/01/1975
We will ask Finance to update its record so that it matches the correct data held by Sales.
After each extraction and data load, we want to monitor whether the data in Finance is being updated, and to show this on a dashboard. Initially the dashboard would show that existing records didn't have locality, postcode, etc.; then, as Finance cleanses and updates the data, it would show the percentage of records that are complete.
Is there an easy way to monitor this from one data load to the next, and to output the changes in the data?
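
To make the requirement concrete, here is a rough Python sketch of the kind of output we have in mind. It is only an illustration under assumptions: it supposes each record carries a stable key (the hypothetical `record_id`, which our extract may not have), that each load is held as a dictionary of records, and that "complete" means Locality, Postcode, Country and DOB are all filled in. The function names are made up for the example.

```python
# Assumed "must have" fields; adjust to whatever counts as complete.
REQUIRED = ["Locality", "Postcode", "Country", "DOB"]


def is_complete(record, required=REQUIRED):
    """True when every required field holds a non-blank value."""
    return all((record.get(f) or "").strip() for f in required)


def percent_complete(load, required=REQUIRED):
    """Percentage of records in one load (dict: record_id -> record) that are complete."""
    if not load:
        return 0.0
    done = sum(is_complete(r, required) for r in load.values())
    return 100.0 * done / len(load)


def field_changes(previous, current, fields=REQUIRED):
    """List (record_id, field, old, new) for values that differ between two loads."""
    changes = []
    for record_id, new in current.items():
        old = previous.get(record_id, {})
        for f in fields:
            before = (old.get(f) or "").strip()
            after = (new.get(f) or "").strip()
            if before != after:
                changes.append((record_id, f, before, after))
    return changes


# Example using the record from the question; "F001" is a made-up key.
load_1 = {"F001": {"First Name": "Julie", "Surname": "Smith",
                   "Address1": "1 Smith St"}}
load_2 = {"F001": {"First Name": "Julie", "Surname": "Smith",
                   "Address1": "1 Smith St", "Locality": "Geelong",
                   "Postcode": "3220", "Country": "Aust",
                   "DOB": "01/01/1975"}}

print(percent_complete(load_1), percent_complete(load_2))
for change in field_changes(load_1, load_2):
    print(change)
```

The two percentages would feed the dashboard figure for each load, and the list of changes is the per-field output we would like to see between loads.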