Is there a way to split a column with multiple delimiters like the excel function TEXTSPLIT?

asohal
asohal Member
edited July 19 in General

I have a dataset which is semi-structured which has columns with multiple values split by different delimiters in different positions e.g
"'123',' 456'; '789' & '321' and '654'"

The excel function TEXTSPLIT can easily split the data into multiple columns easily but i'm struggling to find a function in Aperture which does the same sort of thing.

Answers

  • Josh Boxer
    Josh Boxer Administrator
    edited July 19

    I think this is probably possible. Can you provide an example input and output to explain this further?

    Rather than splitting the column you might clone the column and transform both is one initial thought on how to solve, but an example of the data structure in and desired outcome would help give a more detailed suggestion.

    Also it might be helpful to explain at a higher level what you are trying to achieve, i.e. what do you plan to do with the data once it has been split, as this might lead to a more useful suggestion. Lastly volume of data might impact the approach, but if you are talking Excel then I assume fairly small volumes.

  • asohal
    asohal Member

    Hi Josh,
    Not thought about cloning the column.

    Made some dummy data.
    Input is column 1 and output is column 2-4.

    I am looking to match the email addresses to another data source which is well structured.

  • Josh Boxer
    Josh Boxer Administrator
    edited July 19

    OK that is helpful. I dont think Excel will handle different delimiters, especially things like ' and ', so hopefully there are a manageable number of variations so that you can Replace all of these with a comma (no space).

    Rather than splitting the list into multiple columns, Concatenate all cleansed columns into a single comma separated list

    There are a number of List functions that are under-utilized but often very useful to do things like dedupe the comma separated list, count the number of items in a list and extract each value from the list:

    Matching the email address might need more info, but the Contains match Function can take your comma separated list column as the input and look these values up against the well structured (i assume single column) list you mention to return a true/false if they exist. Remove matches Function is similar but will leave you with a list of unmatched values: