🔎 Data profiling showcase

Danny Roden (Administrator)
edited December 2023 in Feature demos

Ever since Aperture Data Studio’s precursor, Pandora, profiling has been one of the most used and most loved features of the software, and recent releases have taken it even further.

Data profiling is fundamentally the process of rapidly evaluating data (during a discovery phase or pre-migration impact assessment) to:

  1. Identify quick answers to the common questions you have to ask of it (e.g. how well populated are the fields, how consistently structured are the values/formats, and how much can I trust it?). Data Studio provides this through out-of-the-box profile reports that can be run on any grid of data (even midway through workflow design/exploration).
  2. Verify assumptions about the data and ideally identify the ‘unknown unknowns’ that, if left undiscovered, could cause issues in onward analytics/matching/migration activity. Data Studio provides this through proactive outlier analysis (surfacing the presence of statistical anomalies).
  3. Check dependencies and relationships to ensure references are consistent (e.g. pricing is consistent for transactions relating to the same product, gender implies title, etc.). Data Studio provides this through ‘relationship’ analysis in the explore mode.
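
For anyone who wants to see what these three checks amount to outside the tool, here is a minimal Python sketch. It is not how Data Studio implements profiling. The function names, the sample rows, the letter/digit format masks and the z-score threshold are all assumptions made up for illustration.

```python
import re
import statistics
from collections import Counter, defaultdict

# Hypothetical sample rows, invented for this sketch (not taken from Data Studio).
rows = [
    {"product": "A1", "price": 9.99, "postcode": "NG1 5FS"},
    {"product": "A1", "price": 10.49, "postcode": "NG15FS"},
    {"product": "B2", "price": 4.50, "postcode": ""},
    {"product": "B2", "price": 4.50, "postcode": "SW1A 1AA"},
]


def completeness(rows, field):
    """1. Population: share of rows where the field is not None or blank."""
    filled = sum(1 for r in rows if r.get(field) not in (None, ""))
    return filled / len(rows)


def format_pattern(value):
    """1. Structure: reduce a value to a mask (letters -> A, digits -> 9)."""
    masked = re.sub(r"[A-Za-z]", "A", value)
    return re.sub(r"[0-9]", "9", masked)


def format_counts(rows, field):
    """Count how often each format mask occurs among populated values."""
    return Counter(format_pattern(r[field]) for r in rows if r.get(field))


def outliers(values, threshold=2.0):
    """2. Outliers: values whose z-score exceeds an assumed threshold."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        return []  # all values identical, so nothing stands out
    return [v for v in values if abs(v - mean) / sd > threshold]


def dependency_violations(rows, key, dependent):
    """3. Relationships: keys that map to more than one dependent value."""
    seen = defaultdict(set)
    for r in rows:
        seen[r[key]].add(r[dependent])
    return {k: v for k, v in seen.items() if len(v) > 1}


if __name__ == "__main__":
    print(completeness(rows, "postcode"))
    print(format_counts(rows, "postcode"))
    print(outliers([10, 11, 9, 10, 12, 10, 11, 95]))
    print(dependency_violations(rows, "product", "price"))
```

A simple z-score is used here only because it is easy to read; on very small or heavily skewed samples a median-based rule is usually a safer choice.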

In the video below I run through a quick demonstration of each of the three aspects of profiling above, and also highlight some neat features in the software that make the whole process intuitive and fully customisable:

If you have any questions, comments or further thoughts, please leave me a comment below!

Danny
