How to Execute Parallel Workflows (without Custom Steps or Powershell)
Hi all,
The Setup:
Aperture V2.2.8 on Windows Server 2012 R2
Current Situation:
I have to run 8 different Workflows on a Daily basis, but some of those workflows share the same tables.
Current Solution:
I created a Workflow that has "Allow auto-refresh" turned on and contains all the tables that I am going to use in the daily execution. I also disabled auto-refresh on the other 8 workflows, because by the time I call them the data should already have been updated by the previously mentioned workflow.
I then have a PS1 script that (using the API user) first calls the "Refresher Workflow" and then calls the other 8 Workflows in parallel (I call 2 at a time so as not to overload the server).
This usually works OK, but sometimes the PowerShell script has trouble determining the job state of a workflow and fails to continue running.
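For anyone following along, the flow of that script is roughly the following. This is a minimal Python sketch rather than the actual PS1, and `run_workflow` is a hypothetical stand-in that just sleeps; it is not an Aperture API call.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

_lock = threading.Lock()
_running = 0
peak_running = 0   # highest number of workflows seen running at once
completed = []     # workflow names in the order they finished


def run_workflow(name):
    """Hypothetical stand-in: start a workflow and block until it finishes."""
    global _running, peak_running
    with _lock:
        _running += 1
        peak_running = max(peak_running, _running)
    time.sleep(0.01)  # placeholder for the real execution time
    with _lock:
        _running -= 1
        completed.append(name)
    return name


def daily_run(refresher, workflows, max_parallel=2):
    # Step 1: refresh the shared tables before anything else runs.
    run_workflow(refresher)
    # Step 2: run the remaining workflows, at most max_parallel at a time.
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        # list() collects every result, so a failed workflow raises here.
        return list(pool.map(run_workflow, workflows))


results = daily_run("Refresher Workflow", [f"Workflow {i}" for i in range(1, 9)])
```

The fragile part in practice is the "block until it finishes" step, which is where the job-state polling lives in the real script.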
The Question:
Is there a way to do the same thing (running workflows in parallel and in a specific order) without using Custom Steps or PowerShell scripts?
Thanks in advance!
Answers
-
@Johnnie Esquivel you can execute workflows in parallel by making them all embedded and then using a single workflow that contains the multiple embedded workflows. If they need to be executed in sequence, you can also consider using the built-in scheduler in Data Studio and creating a schedule that contains the multiple workflows. The workflows will run in sequence, each one executing only when the preceding workflow completes.
0 -
here are details on the scheduler
Data Quality user documentation | Schedule Jobs (experianaperture.io)
0 -
here are details on reusable workflows
Data Quality user documentation | Re-use Workflows (experianaperture.io)
0 -
@Johnnie Esquivel the way I would approach this is to set up custom events in a workflow that can be used to kick off several schedules at the same time.
When an event occurs, it fires a notification, and a notification in Data Studio can start a schedule or a workflow (or send an email). Read more about notifications and events in the documentation.
In this case, we can depict a scenario similar to yours like this:
After running the Refresh Data workflow, you want to have two “branches” of execution running workflows in parallel, branch A (workflows A1 and A2) and branch B (B1 and B2).
Although you don’t describe it in your case, it’s quite common that you’ll want a final workflow C to run only once when all workflows on both branches have executed.
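To make the intended ordering concrete, here is a minimal Python sketch of the scenario. It only models the order of execution, not how Data Studio does it; `run_workflow` is a hypothetical placeholder that records the workflow name.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

log = []                      # workflow names in the order they started
_log_lock = threading.Lock()


def run_workflow(name):
    """Hypothetical placeholder: record that the workflow ran."""
    with _log_lock:
        log.append(name)


def run_branch(workflows):
    # Within a branch, workflows run in series (like a schedule).
    for name in workflows:
        run_workflow(name)


def run_all():
    run_workflow("Refresh Data")
    branches = [["A1", "A2"], ["B1", "B2"]]
    # The two branches run in parallel with each other.
    with ThreadPoolExecutor(max_workers=len(branches)) as pool:
        futures = [pool.submit(run_branch, b) for b in branches]
        for f in futures:
            f.result()  # wait for the branch; re-raises any failure
    # Workflow C runs once, only after both branches have finished.
    run_workflow("C")


run_all()
```

The interleaving of A and B entries in `log` can vary between runs, but "Refresh Data" is always first, "C" is always last, and each branch keeps its internal order.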
To set this up, we’ll first group the workflows we want to run in series into Schedules. Schedules aren’t absolutely necessary here but they’ll make it simpler to manage the process. Create three schedules:
- Initialize: This will be the schedule you run to kick off the data refresh process and control the parallel execution
- Schedule A: Workflows A1, A2…, An
- Schedule B: Workflows B1, B2…, Bn
The Controller workflow will contain the Fire Event steps that are going to fire the notifications to run each “branch” schedule:
We then set up Notifications that run schedules A and B when the respective custom events occur.
To run Workflow C, we can create a third notification which fires when workflows A2 and B2 complete:
You'll end up with three notifications:
And three schedules. Note that I've made "Initialize" a recurring schedule, which will run automatically at midnight to kick off the process.
Here's the end result. The controller starts the execution for the A and B branches, and workflows A1 and B1 start at the same time and run in parallel:
The whole thing is easy to extend, either by adding more branches (for more parallelization) or by adding more workflows to the A and B branches.
There's one caveat here, which is that all the workflows must be in the same Space.
1 -
@Henry Simms Looks like a good approach. One question: doesn't this assume that A2 and B2 will finish within 1 minute of each other? Wouldn't you need to set the Time Period to a reasonably long gap to ensure Workflow C always executes?
0 -
@Ian Buckle Yes, you're absolutely right. In my "trigger workflow C" schedule, I should probably leave a much longer time period, otherwise the notification won't fire and Workflow C won't be run.
I'd also possibly want to consider using the notification's "trigger if expired" option to always run Workflow C, as long as one of workflows A2 and B2 has completed.
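As a rough illustration of why the time period matters, here is a toy Python model of the behaviour discussed above. It is an assumption based on this thread, not Data Studio's actual implementation: the function name, its parameters, and the exact expiry rule are all hypothetical.

```python
def should_fire(seen, required, window, now, trigger_if_expired=False):
    """Toy model of a notification that waits for several events.

    seen: dict mapping event name -> time (in seconds) at which it occurred
    required: the events the notification waits for
    window: the time period, in seconds, counted from the first event
    now: the current time, in seconds
    """
    if not seen:
        return False  # nothing has happened yet
    first = min(seen.values())
    all_seen = set(required) <= set(seen)
    # Normal case: every required event arrived inside the time period.
    if all_seen and max(seen[e] for e in required) - first <= window:
        return True
    # Fallback: the period ran out with only some of the events seen.
    return trigger_if_expired and now - first > window


required = ["A2 complete", "B2 complete"]
both = {"A2 complete": 0, "B2 complete": 300}   # B2 finishes 5 minutes later
only_a = {"A2 complete": 0}

one_minute = should_fire(both, required, window=60, now=300)
one_hour = should_fire(both, required, window=3600, now=300)
expired = should_fire(only_a, required, window=60, now=61, trigger_if_expired=True)
waiting = should_fire(only_a, required, window=60, now=30, trigger_if_expired=True)
```

In this model, a 1-minute period misses the case where B2 finishes 5 minutes after A2 (`one_minute` is False), a 1-hour period catches it (`one_hour` is True), and "trigger if expired" fires once the period has run out even though only A2 completed (`expired` is True, `waiting` is False).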
0