How to promote from dev to prod environments when using different external systems
I’ve recently been working with a client who had used their Data Studio development environment to create workflows based on data they’d extracted from their development SQL Server database. They were in a position to promote those workflows to their production environment, and also point the data sources to use the live data from their production SQL Server DB (a separate instance to the dev database, with different credentials).
They wanted to ensure that they followed the right steps so that any future updates to their workflows in the dev environment could easily be synchronized to production, without the need to edit workflows once or reconfigure the external system every time.
Data Studio’s synchronization functionality allows you to do this, but it’s important to follow the steps laid out below.
To do this, we’ll make use of Data Studio’s metadata management capability: the ability to export any of the logic (dataset schemas, views, workflows, functions, external systems etc.) from one space to another. Note that no actual data will be exported or imported.
Environment 1 (DEV):
- External System connection to Azure SQL Server dev instance w/ dev credentials
- Space 1: Data Import
- Used to control access to source data
- Two datasets loaded from external system
- Two views created from these which are shared into…
- Space 2: Data Usage
- Contains workflow(s) that process the data in the views and export back to the external system
Environment 2 (PROD):
- External System connection to different SQL Server DB (it must be using the same JDBC driver)
- Identical Space / Dataset / View / Workflow structure
- Intention is to preserve links between dev and prod systems (to allow regular synchronization) without having to reconfigure the external system / credentials each time
- In order to get into this desired state, the first thing you will need to do will be a full export of the contents of each space that you want to transfer from the DEV environment - including the external system & credential at least once (for my example, I export the external system and credential from the ‘Data Import’ space and just the workflow from the ‘Data Usage’ space - see screenshots at the end for the specific exports I configured)
- Ensure that the same driver used for the DEV environment external system is installed in the PROD environment
- Create spaces in the PROD environment to synchronize these .dmx files into. You many choose to use the same space names in Dev and Prod, but this isn’t a requirement. We would recommend using the same space structure
- Synchronize the space export that contains the external system and credential first, followed by the remaining spaces.
- NB Synchronize retains the link between the internal source repository UUID and the target internal repository UUID. Import will always import objects as new and not retain these links between systems
- Inspecting your spaces now should show everything is as it was in the DEV environment (including shared views / datasets / workflows etc.) with no errors reported in the workflows
- Finally, update the external system's connection properties to point to the PROD database, and fix up the credential used to to access the external system. It’s important at this stage to edit the existing External system and credential that you just synchronized, rather than creating new ones, as this maintains the link between the Datasets and their source system. Refreshing the data in your newly synchronized Datasets will now import data from the relevant tables in the production DB.
At this point, you should have a working copy of everything from the DEV environment set-up in your PROD environment using your new external system / credential.
For the next planned promotion between DEV and PROD, all objects that need to be updated should be included in the export EXCEPT any external systems that have been previously updated in the steps outlined above (any new external systems will need to be included and handled as in the steps above)
You will observe that on sync, any links (e.g. datasets / export steps) to the external system in DEV will automatically point to the (same but re-configured) new external system in the PROD environment without having to re-configure these every time.
Initial DEV 'Data Import' space export screen:
Initial DEV 'Data Usage' space export screen: