Checking data like social security numbers (SSN) for correctness

Clinton Jones, Experian Elite
edited December 2023 in General

Social Security numbers, national identity numbers and the like are, at face value, simply strings of data made up of letters or numbers and segmented with hyphens.

There is, however, inherent intelligence in many of these identifiers, and some even include check digits.
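To make the check-digit idea concrete, here is a minimal Python sketch of the Luhn algorithm, one widely used check-digit scheme (payment card numbers and the Canadian SIN use it). US SSNs themselves do not carry a check digit, so this is only an illustration of the concept. The function name luhn_valid is made up for this example and has nothing to do with Data Studio.

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn check.

    Hypothetical helper for illustration; separators such as spaces
    or hyphens are ignored.
    """
    digits = [int(ch) for ch in number if ch.isdigit()]
    if len(digits) < 2:
        return False

    total = 0
    # Walk from the rightmost digit; double every second digit.
    for position, digit in enumerate(reversed(digits)):
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as summing the two digits of the product
        total += digit

    # A valid number has a digit sum divisible by 10.
    return total % 10 == 0


if __name__ == "__main__":
    for candidate in ["79927398713", "79927398710"]:
        print(candidate, luhn_valid(candidate))
```

A single mistyped digit changes the sum, which is why a check digit catches errors that a layout-only pattern cannot.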

You'll find regular expressions to validate these kinds of identifiers all over the internet, but not all regular expressions are created equal, and some do a better job than others.

A commonly referenced one is ^\d{3}-\d{2}-\d{4}$, but it is incomplete: it checks only the 3-2-4 digit layout, not whether the digits themselves are plausible.

The associated dummy data set illustrates this.

Here we can see that the 000-00-0000 SSNs are clearly wrong.

Using the regex above in a match expression yields partially correct results, but it does not catch the all-zeros number.
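The same behaviour is easy to reproduce outside Data Studio. The sketch below uses Python's re module rather than a Data Studio function, and the sample values are invented for illustration. It shows the simple pattern accepting the all-zeros value because it only looks at the layout.

```python
import re

# The commonly referenced pattern from the post: layout check only.
SIMPLE_SSN = re.compile(r"^\d{3}-\d{2}-\d{4}$")

# Made-up sample values, not the data set from the post.
samples = ["123-45-6789", "000-00-0000", "123456789", "12-345-6789"]

# Map each sample to whether the simple pattern accepts it.
simple_results = {value: bool(SIMPLE_SSN.match(value)) for value in samples}

if __name__ == "__main__":
    for value, accepted in simple_results.items():
        print(f"{value}: {'match' if accepted else 'no match'}")
```

The badly formatted values are rejected, but 000-00-0000 passes, which is exactly the gap described above.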


Fortunately, the business constants that come predelivered with Data Studio include a better regex, which you can use within the same COMPARE function. This business constant is much more elaborate.


The results speak for themselves.
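For readers without access to the business constant, here is a rough Python sketch of what a stricter pattern can look like. This is not the Data Studio business constant, whose exact text is not shown here. It is an assumption built from the published SSN rules: the area number cannot be 000, 666 or 900-999, the group number cannot be 00, and the serial number cannot be 0000. The names STRICT_SSN and is_plausible_ssn are hypothetical.

```python
import re

# Assumed stricter pattern (NOT the Data Studio business constant):
#   (?!000|666|9\d{2})  area must not be 000, 666 or 900-999
#   (?!00)              group must not be 00
#   (?!0000)            serial must not be 0000
STRICT_SSN = re.compile(r"^(?!000|666|9\d{2})\d{3}-(?!00)\d{2}-(?!0000)\d{4}$")


def is_plausible_ssn(value: str) -> bool:
    """Return True if `value` has SSN layout and no disallowed segment."""
    return STRICT_SSN.fullmatch(value) is not None


if __name__ == "__main__":
    for value in ["123-45-6789", "000-00-0000", "666-12-3456", "123-45-0000"]:
        print(f"{value}: {is_plausible_ssn(value)}")
```

A pattern like this only says a value is structurally plausible; it cannot tell you whether the number was ever actually issued.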

