Data Discrepancy Comparison Post-Migration

Does Aperture have any functionality that compares two datasets and highlights the discrepancies between them on a row-by-row and column-by-column basis? I've tried the compare datasets step, but it does not highlight the differing rows or columns. The scenario is that data has been migrated from an old system to a new system, and as part of our UAT we need to reconcile the new system's data with the transformed data provided for migration.

Specifically, the function should:

  1. Take two datasets as input: the original dataset and the migrated dataset.
  2. Compare each row and each column of the two datasets.
  3. Identify and highlight discrepancies, such as differences in cell values.
  4. Provide a summary of discrepancies, indicating which rows and columns have mismatches.
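To make the four requirements above concrete, here is a minimal sketch written outside Aperture in plain Python. It is only an illustration of the desired behaviour, not how Aperture's compare step works. The function name `compare_datasets`, the sample rows, and the column names are all hypothetical, and the sketch assumes each key value appears only once per dataset.

```python
from collections import Counter


def compare_datasets(original, migrated, key):
    """Compare two lists of dict rows, matched on a `key` column.

    Assumes each key value appears at most once in each dataset.
    """
    orig_by_key = {row[key]: row for row in original}
    migr_by_key = {row[key]: row for row in migrated}

    mismatches = []  # one entry per differing cell
    for k in sorted(orig_by_key.keys() & migr_by_key.keys()):
        o, m = orig_by_key[k], migr_by_key[k]
        # Check every column that exists on either side of the pair.
        for col in sorted(set(o) | set(m)):
            if col != key and o.get(col) != m.get(col):
                mismatches.append({
                    "key": k,
                    "column": col,
                    "original": o.get(col),
                    "migrated": m.get(col),
                })

    summary = {
        "only_in_original": sorted(orig_by_key.keys() - migr_by_key.keys()),
        "only_in_migrated": sorted(migr_by_key.keys() - orig_by_key.keys()),
        "mismatches_per_column": dict(Counter(d["column"] for d in mismatches)),
    }
    return mismatches, summary


# Hypothetical sample data, keyed on UPRN.
original = [
    {"UPRN": "100", "postcode": "AB1 2CD", "status": "live"},
    {"UPRN": "200", "postcode": "EF3 4GH", "status": "live"},
    {"UPRN": "300", "postcode": "IJ5 6KL", "status": "closed"},
]
migrated = [
    {"UPRN": "100", "postcode": "AB1 2CD", "status": "live"},
    {"UPRN": "200", "postcode": "EF3 4GX", "status": "live"},
    {"UPRN": "400", "postcode": "MN7 8OP", "status": "live"},
]

mismatches, summary = compare_datasets(original, migrated, key="UPRN")
for d in mismatches:
    print(d)
print(summary)
```

Each entry in `mismatches` identifies one differing cell by key and column, and `summary` lists the keys missing from either side plus a mismatch count per column.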

Many thanks in anticipation.

Best regards,

Marco

Answers

  • @Josh Boxer Thanks for the response. We are currently on version 2.12.12.90, which is why I cannot get the results.

  • Marco_13112001
    edited May 2024

    @Josh Boxer We have upgraded to the latest version and I can now use the compare step as suggested. However, I'm having trouble setting up the step and would like your help to clarify some points. My understanding is that the key should be an identifying value common to both datasets, e.g. we are using UPRNs as the key, but I am getting results flagged as duplicates, and those records are removed from the analysis. Is my understanding wrong?

  • Josh Boxer
    Administrator

    The key is the identifier of the row/record to be compared. It should appear only once in each source.
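    As a rough way to check this before running the comparison, the sketch below (plain Python, outside Aperture) lists any key values that occur more than once in a source. The helper name `duplicated_keys` and the sample rows are hypothetical.

```python
from collections import Counter


def duplicated_keys(rows, key):
    """Return {key value: count} for key values that appear more than once."""
    counts = Counter(row[key] for row in rows)
    return {k: n for k, n in counts.items() if n > 1}


# Hypothetical sample: UPRN "100" appears twice, so it cannot serve as a key as-is.
rows = [
    {"UPRN": "100", "postcode": "AB1 2CD"},
    {"UPRN": "200", "postcode": "EF3 4GH"},
    {"UPRN": "100", "postcode": "AB1 2CD"},
]

dupes = duplicated_keys(rows, "UPRN")
print(dupes)
```

    An empty result for both sources would suggest the chosen column is safe to use as the key.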

  • @Josh Boxer If the key is repeated within the datasets, will the compare step not work? The reason for my question is that the source and target datasets contain multiple records related to the same UPRN.

  • Josh Boxer
    Administrator

    If these rows are identical then deduplicate (Group) the data before comparing. If they are not identical then how would you check that the rows are the same, i.e. that UPRN1 in datasetA is being compared to the correct UPRN1 row in datasetB?
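    The two cases can be sketched in plain Python (outside Aperture, so this is an illustration rather than what the Group step does internally). Identical rows are collapsed first; any UPRN that still repeats afterwards needs an extra column in the key. The `record_type` column and the helper names are hypothetical.

```python
from collections import Counter


def deduplicate(rows):
    """Drop rows that are identical in every column, keeping first occurrences."""
    seen = set()
    unique = []
    for row in rows:
        fingerprint = tuple(sorted(row.items()))  # hashable snapshot of the row
        if fingerprint not in seen:
            seen.add(fingerprint)
            unique.append(row)
    return unique


def repeated_keys(rows, key_columns):
    """Return the key tuples that still occur more than once."""
    counts = Counter(tuple(row[c] for c in key_columns) for row in rows)
    return sorted(k for k, n in counts.items() if n > 1)


# Hypothetical sample: UPRN 100 has two identical rows, UPRN 200 has two different ones.
rows = [
    {"UPRN": "100", "record_type": "address", "value": "1 High St"},
    {"UPRN": "100", "record_type": "address", "value": "1 High St"},
    {"UPRN": "200", "record_type": "address", "value": "2 Low Rd"},
    {"UPRN": "200", "record_type": "owner", "value": "J Smith"},
]

unique = deduplicate(rows)
still_repeated = repeated_keys(unique, ["UPRN"])
composite_repeated = repeated_keys(unique, ["UPRN", "record_type"])
print(len(unique), still_repeated, composite_repeated)
```

    Here deduplication resolves UPRN 100, while UPRN 200 only becomes unique once a second column is added to the key, which is the information needed to pair the correct rows across the two datasets.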