# How to determine Nth occurence in a column?

Hello.

Do you know of any function or combination of functions that can achieve a new column in the data as the "occurrence" column in the screenshot below?

I would like to know if the ID in the first column is the 1st, 2nd ... Nth occurrence in that column. Is there a way to achieve this?

Richard

Hi Richard,

I came up with a few solutions yesterday (one involving a Cartesian Join, another involving Group/Join) but also came up with something that, like yours, involves a Lookup but doesn't require Get Cell and preserves sort order.

(See workflow screenshot below)

Step 1 "Add Source Row Number" - add current row number to the source (call the column "source row")

Step 2 - "Sort by Value" - sort this ascending by the value

Step 3 - feed this into 1) the first input of a Lookup step 2) "Add Row Number" a transform step that adds the row number again ("sorted row number") and feed that into the second input of the Lookup step ("Value -> First Row"). Add a lookup definition that matches Value == Value and returns the first value of "sorted row number". Call this definition "first row"

Step 4 - "Compute Occurrence" - a transform step to (like your solution) subtract "first row" from the current row number (this results in a zero-indexed nth occurence) and call this column "occurrence"

Step 5 - "Restore Original Order" - restore the original sort order by sorting ascending on "source row"

Step 6 - "Hide Columns and Adjust" - hide unnecessary columns and add 1 to "occurrence" to make it 1-indexed

In your solution, if you lookup and return first value (which would be a row number) you don't need Get Cell to mark the change in value.

Ian

• Hi Richard

There are two similar values that you could calculate more easily

1. the total number of occurrences for each ID using a Group
2. the First or Last occurrence of each ID using a Lookup step

Assuming neither of those meet your needs I think it would be possible using multiple functions that utilize the Function Get cell however it would require the data to be pre-sorted (by ID in this case), would that be an issue for any subsequent operations?

• Hi Josh,

Thanks for your comment. The aim is to assign a new, unique sequential ID based on the original ID, so sorting won't be an issue.

After posting the question we've been experimenting a bit with Get Cell as you suggested and what seems to work is the following:

1. Sort
2. Mark the first appearance of a value in a new column (with an X) and also add CurrentRow in another
3. Do a lookup with this table on itself, by returning the Row number where we have the ID and the mark
4. Subtract the (LookupResult-1) from the CurrentRow to get this sequential numbering