Replace Matches function returning custom columns?

Richard Illyes
edited December 2023 in General

Hello.

I have a large table in which I would like to replace certain values with a lookup from a static dataset. (Example below, I'd like to replace the "..." placeholder values with the actual correct ones from the static dataset)

Basically what I would need is the option to choose the return column dynamically so that I don't have to write the whole function for tens of different columns, just maybe make 1 custom function which references my static dataset, and I only need to select the value to return (since my lookup table,key and lookup column would be the same every time, and a lookup key you can define to be anything anyway.).

I could theoretically use a join and make a custom IF function to achieve this, but it would need some cleanup of the data, and I really have a lot of columns to do this on so I'd prefer to keep my step results clean instead of adding 20-30 extra columns only to then throw them out.

Is there an easy way to do this using a custom lookup function or should I resort to joining?



Answers

  • Josh Boxer
    Josh Boxer Administrator
    edited December 2022

    Hi Richard

    I think the way to do this is to isolate all of the '...' values so that they can be replaced

    1. use the Columns to rows step to turn the table to fix into a three column list, then Split step for rows that equal '...'
      1. Key | Attribute (column name) | Value
    2. do the same to the static dataset to turn it into a lookup table (or if it never changes then amend before creating Dataset)
    3. add a third column to both datasets that concatenates the Key and Attribute (column name) values
    4. Lookup step to search the static table concatenate column for the fix table concat column and return the Value column
    5. Transform to cleanup the fixed data removing the '...' values column, maybe rename columns
    6. Union fixed data and the unchanged data that was split
    7. Rows to columns to return the data to a wide structure
    8. Maybe a sort to order to the original
  • Danny Roden
    Danny Roden Administrator

    Hi @Richard Illyes,

    Thanks for your post. I've just had a stab at this using @Josh Boxer's recommendation as a starting point and was able to get this to work. To bring this to life a bit I've created a .dmxd file you can import to explore the solution for yourself, and have summarised the steps in this post too:

    • Use Columns to Rows to flip the data into a 2x columns table (do this for both the source data and the reference file)
    • Next use a Lookup to get the values from the reference table alongside the data to fix, and then use a Transform to replace your default value "..." with the value from the reference table:


    Then use the Rows to columns to flip the data back so it looks like this:

    And then finally use a Group and a Sort to present it in a more suitable manner:


    To help understand this more, I've attached the .dmxd file for my solution which you should be able to easily import into your instance of Data Studio (note that you'll need to be running version 2.9.4 or above).