Incrementally Loading Datasets
I've thought through a few different ways to incrementally load a dataset, using batches on the dataset side and views on the source side. However, I'm wondering whether anyone else has done some work around this and would be willing to share their thoughts / ideas.
Essentially we have some really large tables which we want to use in workflows, potentially as a whole, but maybe only for a subset or sample. Rather than waiting the full time for these tables to reload, we want to just append the latest data.
For example, we might have a table with 1.2 billion records that covers the span of 12 months. That averages 100 million records a month, or somewhere over 3 million a day. Reloading this entire dataset each day may not even be possible; it might take more than 24 hours. But appending 3 million records each day might take just 15 minutes or so, which is a much more reasonable time to wait for a dataset refresh before running workflows against it.
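To make the idea concrete, here's a minimal sketch of the kind of watermark-based incremental append I have in mind, using SQLite stand-ins for the real source and target; the table name (`events`), the date column, and the specific dates are all hypothetical, and a real implementation would obviously need to handle late-arriving and updated rows:

```python
import sqlite3

# In-memory databases stand in for the real source and target systems.
src = sqlite3.connect(":memory:")
src.execute("CREATE TABLE events (id INTEGER, event_date TEXT)")
src.executemany("INSERT INTO events VALUES (?, ?)",
                [(1, "2024-01-01"), (2, "2024-01-02"), (3, "2024-01-03")])

tgt = sqlite3.connect(":memory:")
tgt.execute("CREATE TABLE events (id INTEGER, event_date TEXT)")
# Pretend the initial full load already covered everything up to 2024-01-01.
tgt.execute("INSERT INTO events VALUES (1, '2024-01-01')")

def incremental_load(src, tgt):
    """Append only the source rows newer than the target's high-water mark."""
    (watermark,) = tgt.execute("SELECT MAX(event_date) FROM events").fetchone()
    new_rows = src.execute(
        "SELECT id, event_date FROM events WHERE event_date > ?",
        (watermark,)).fetchall()
    tgt.executemany("INSERT INTO events VALUES (?, ?)", new_rows)
    tgt.commit()
    return len(new_rows)

appended = incremental_load(src, tgt)
print(appended)  # only the 2 rows past the watermark are appended
```

The key design point is that the target itself tells you the high-water mark, so the daily job only ever scans and moves the new slice instead of the full 1.2 billion rows.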
Thanks in advance,