Is there an easy way to do this partitioning?

James T Sidebotham · May 23

Hi,

I have seen previous posts on partitioning, and use that technique regularly. However, my need this time is slightly different. I have groupings of duplicates, but I would like to count incrementally when I come across a new identifier. I actually have this working now, but it took some Aperture kung-fu, so wondering if anyone has a simple way to do this.

The column I am focusing on to create is ID_NUM, I only want it to increment within the GROUP when a new IDENTIFIER is found, otherwise keep the existing ID_NUM. (ID_TOT shows how many unique IDs there are in the group.)

Thanks

GROUP	IDENTIFIER	ID_TOT	ID_NUM
AAAA	12345	4	1
AAAA	99999	4	2
AAAA	99999	4	2
AAAA	54321	4	3
AAAA	11111	4	4
AAAA	11111	4	4
BBBB	90210	3	1
BBBB	90210	3	1
BBBB	77777	3	2
BBBB	13579	3	3
BBBB	13579	3	3

Josh Boxer · May 27

Take a look at this discussion:

https://community.experianaperture.io/discussion/837/

If that does not help then might useful to share more detail on how you approached and where you'd like it to be simpler

James T Sidebotham · May 27

Thanks Josh, I had not seen that discussion in my forum search. The two that I used for help were:

https://community.experianaperture.io/discussion/244/partition-values-dynamic-sql-partition-row-number#latest

https://community.experianaperture.io/discussion/1330/previous-row-value

The steps that I did were:

Create a new column "Sort-row" from Row number (to be used by sort later to preserve the order)
Group on: "Group", "Identifier" and aggregate on first populated value of "Sort-row"
Sort the result (7 rows) by "Sort-row"
Add new column "Row#2" from Row number
Use a Lookup to get the row number where the partition (Group) starts
Calculate the row increments (Part-row-incr) within the partition (Add(Subtract([c:ROW#2], [c:PART-ROW-START]),1))
Do a Final lookup using the original data (11 rows) to the 7 rows from step 6. Lookup type=First, Return Column= "part-row-incr" where group=group and identifier=identifier

Looking at the thread you provided, I don't think I could do this much simpler. I did however have another method of creating a key of "Group+Identifier" and using Harmonize Duplicates rather than the grouping from step 2 above. Both work fine, I just don't know which method would be more efficient with 100,000s of duplicates and groupings - I haven't tested that yet. Any thoughts on that?

Thanks

James T Sidebotham · May 27

Data Studio Export - PART DEMO 2 METHODS - 27 May 2025 09-38-53-604.dmxd

Workflow if interested

James T Sidebotham · May 28

@Josh Boxer - any thoughts on the Harmonize vs Grouping method?

Also, if Aperture had a step similar to this mockup, I would be very happy!! It would add in 2 columns for Count and Row based on the column names/requirements I gave above. (I am sure adding other functions as well, such as average, sum etc would be useful).

Thanks

Ian Buckle · May 29

Many moons ago i did write a partition custom step, so i think it should be possible to do within the SDK.

I'm not really sure of the status of this step and whether it was maintained for later versions though.

James T Sidebotham · May 29

That's great Ian!! Seems like a native step would be used by others.

Is there an easy way to do this partitioning?

Answers

Categories