Scheduling in Aperture Data Studio Version 1 vs Version 2
Many Aperture Data Studio customers process large volumes of data. Workflows are often scheduled as overnight or weekend jobs so that most of the processing time falls outside core business hours. There may also be a need to run multiple workflows automatically, one after another.
In Aperture Data Studio Version 1, you can configure a workflow to run at a future time and, if required, at defined intervals. However, there are some limitations:
- A schedule applies to a single workflow only. If you have a set of workflows to be run one after another, you need to schedule each of them with a fixed start time or interval. This requires prior knowledge of how long each workflow takes to complete, which can be challenging if you expect varying volumes of data, as when processing delta files. You may have to observe a series of workflow runs before you can confidently schedule them.
- You cannot specify recurrence for specific days of the week. For example, if you would like the workflow to run on both Saturdays and Sundays, you have to set up two schedules, each with a different start time but both with an interval of 1 week.
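The weekend workaround can be pictured with a small sketch. This is not Data Studio code: the `weekly_runs` helper and the start dates are hypothetical, and the snippet only illustrates how two separate one-week-interval schedules together cover Saturdays and Sundays.

```python
from datetime import datetime, timedelta

def weekly_runs(start, count):
    """Return `count` run times beginning at `start`, repeating every 7 days."""
    return [start + timedelta(weeks=i) for i in range(count)]

# Hypothetical start times: Saturday 6 Jan 2024 and Sunday 7 Jan 2024, both at 02:00.
saturday_schedule = weekly_runs(datetime(2024, 1, 6, 2, 0), 3)
sunday_schedule = weekly_runs(datetime(2024, 1, 7, 2, 0), 3)

# Two independent schedules are needed to cover both weekend days.
all_runs = sorted(saturday_schedule + sunday_schedule)
for run in all_runs:
    print(run.strftime("%a %Y-%m-%d %H:%M"))
```

Each schedule has to be created and maintained separately, so any change to the run time must be made twice.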
In Version 2, we have significantly improved this experience with a new Scheduling component.
- Schedules can be used to automatically run a Workflow, or several Workflows in a specific order, at regular intervals. Workflows specified in a schedule are executed in sequence: if one workflow fails, the subsequent ones are not run. This gives you the assurance that a workflow will not run in error because of an issue with a preceding workflow.
- You can easily specify the recurrence pattern and even use a CRON expression as a shorthand way of defining a recurring schedule.
- In addition, you can trigger one or more Schedules to be run based on a Notification event.
One common scenario would be to use the Dataset Loaded event to trigger a Schedule once the source dataset has been loaded.
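The stop-on-failure behaviour of a multi-workflow schedule can be sketched as follows. This is an illustration of the idea only, not Data Studio's implementation or API: `run_in_sequence` and the three workflow functions are hypothetical names invented for the example.

```python
from typing import Callable, List, Tuple

def run_in_sequence(workflows: List[Tuple[str, Callable[[], None]]]) -> List[str]:
    """Run workflows in order and stop at the first failure.

    Returns the names of the workflows that completed successfully.
    """
    completed = []
    for name, run in workflows:
        try:
            run()
        except Exception as exc:
            # A failure halts the schedule; later workflows are not started.
            print(f"{name} failed ({exc}); skipping the remaining workflows")
            break
        completed.append(name)
    return completed

# Hypothetical workflows standing in for real ones.
def load_customers() -> None:
    print("loading customers")

def validate_customers() -> None:
    raise ValueError("unexpected column count")

def export_report() -> None:
    print("exporting report")

completed = run_in_sequence([
    ("Load customers", load_customers),
    ("Validate customers", validate_customers),
    ("Export report", export_report),
])
```

In this run only the first workflow completes; the export step never starts because validation failed. As for recurrence, in standard five-field Unix cron notation an expression such as `0 2 * * 6,0` means 02:00 every Saturday and Sunday; the exact CRON dialect that Data Studio accepts is not stated here and may differ.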
Read about the differences in triggering workflow executions in Aperture Data Studio Version 1 vs Version 2 here.