Group Step Sum of Nulls should return Null

What problem are you facing?

When getting the Sum of a column via a Group Step, if all the values considered in the aggregation are Null, the Sum presently returns zero. This makes it impossible to distinguish whether all the values were Null or if the data contained numbers with a sum of zero. This distinction may matter. Furthermore, this behaviour is inconsistent with Aperture's own functions; using Add or Sum Digits returns "Invalid number -" when applied to only Null values.

What impact does this problem have on you/your business?

This impacts data quality in that it effectively results in a loss of information - when a Sum is applied to a column and returns zero, I am not able to determine whether the column contained only Nulls, or whether it contained only zeroes, or whether it contained both positive and negative numbers which added to zero (which may be a relatively common use-case e.g. one could imagine debt and repayments being applied to a credit account may sum to zero if paid in full).

It is my understanding that in SQL the sum of Nulls returns Null whereas in Excel the sum of Nulls returns zero. I would argue that when faced with these competing approaches, Aperture should align with that which maximises data quality - and in my view this is the approach that minimises the loss of information. If needed, any Nulls resulting from a Sum aggregation with my proposed change could later be converted to zero, and we would be back to what we have now. However, there is no such option currently to do the reverse - we lose the information almost irreversibly by having the Nulls sum to zero unless a workaround is used.

Do you have any existing workarounds? If so, please describe those.

Yes - my workaround is that in the Group Step where the Sum is being calculated, I additionally include a Count Excluding Nulls aggregation for any column that is being summed. Following the Group Step, I add a Transform Step with an Advanced Function applied to the result of the Sum. This function checks whether the result of the Count Excluding Nulls is zero (i.e. all values were Null), and if it is zero, the Sum is set to Null (else it is left at its current value).

Do you have any suggestions to solve the problem? Feel free to add images if this helps.

Ensure that the Sum of Nulls, as returned by the Group Step, is Null (or otherwise Invalid) so that the behaviour of the Group Step Sum is consistent with that of the Add and Sum Digits functions. Alternatively, users could be given a choice of whether they want the Sum of Nulls to return Null or zero - this could be toggled via a checkbox in the "Aggregate details" when editing a Sum aggregate, in much the same way as the "Distinct" option is currently a checkbox.

Tagged:
6
6 votes

Gathering interest · Last Updated