A way to avoid writing zero-row batches to Datasets

What problem are you facing?
I have a workflow that runs regularly and includes a filter step that retains only "error condition" records. These records are then written into a multi-batch snapshot using the Take Snapshot step.
The problem is that most of the time there are no error conditions, meaning there are no rows to write to my Dataset. However, the Dataset still gets updated with a 0-row batch.
What impact does this problem have on you/your business?
While 0-row batches don't break anything outright, they do cause some issues:
- Several automations we have in place are triggered whenever the dataset updates. These fire even when the dataset is updated with a new 0-row batch.
- I have observed slower performance when interacting with a dataset that has many batches, even when most of those batches contain 0 rows.
Do you have any existing workarounds? If so, please describe those.
One thing I have tried is using the "Only fire if input has rows" feature of the Fire Event step to orchestrate writing a batch only when its input has one or more rows, but implementing this significantly increased the complexity of my solution, making it more brittle and harder to maintain.
Another proposed workaround is a scheduled clean-up workflow which deletes batches containing 0 rows.
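For illustration, here is a rough sketch of what that clean-up could look like in Python. The base URL, endpoints, and response fields are invented for the example, assuming a hypothetical REST API for listing and deleting batches; they are not the platform's actual API:

```python
import requests

BASE_URL = "https://api.example.com/v1"  # hypothetical API endpoint
DATASET_ID = "error-conditions"          # hypothetical dataset ID
HEADERS = {"Authorization": "Bearer <token>"}

def delete_zero_row_batches():
    # List all batches in the dataset (assumed endpoint and response shape)
    resp = requests.get(f"{BASE_URL}/datasets/{DATASET_ID}/batches", headers=HEADERS)
    resp.raise_for_status()
    for batch in resp.json()["batches"]:
        # Delete any batch that contains no rows
        if batch["rowCount"] == 0:
            requests.delete(
                f"{BASE_URL}/datasets/{DATASET_ID}/batches/{batch['id']}",
                headers=HEADERS,
            ).raise_for_status()

if __name__ == "__main__":
    delete_zero_row_batches()
```

This keeps the Dataset tidy after the fact, but it doesn't stop the update-triggered automations from firing in the first place, which is why it only partially addresses the problem.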
Do you have any suggestions to solve the problem? Feel free to add images if this helps.
Yes - a suggestion would be to add an "Only write if input has rows" option to the Take Snapshot step, as a way to prevent zero-row batches from being written.
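To make the intent concrete, the option would effectively be a row-count guard around the write. A minimal Python sketch, where take_snapshot and dataset.write_batch are hypothetical stand-ins rather than real platform calls:

```python
def take_snapshot(dataset, rows, only_write_if_input_has_rows=True):
    """Write a batch of rows to a dataset as a snapshot.

    Hypothetical function: models the proposed 'Only write if input
    has rows' option, not any real platform API.
    """
    if only_write_if_input_has_rows and not rows:
        return  # skip the write entirely; no zero-row batch is created
    dataset.write_batch(rows)  # hypothetical write call
```

With the option enabled, runs with no error-condition records would leave the Dataset untouched, so the update-triggered automations would not fire.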
Comments
We are doing this for the Export step currently, so we could look at Take Snapshot if there is interest in this.