Best approach for this set-up?

Hi,

Any recommendations for this configuration? Lets say I have a space called 'PROD-DQR' that is used to access production data with a read-only credential to the source DB. It validates data, sends to issues etc. Only DBAs are allowed to perform updates in production, so was thinking:

New spaced called 'PROD-UPDATE' that only a DBA has access to along with an update credential for the source DB (external system). Workflow that performs data corrections/updates are created in 'PROD-DQR' and once ready, the workflow is exported from there and imported into 'PROD-UPDATE'. The workflow is changed to point to the production DB using the update credential. Then the DBA runs the workflow.

Thanks

Answers

  • Hi, wondered if anyone had any suggestions or feedback? Thanks

  • Hi James, a few thoughts.

    Firstly, because users can be assigned credential permissions (https://docs.experianaperture.io/data-quality/aperture-data-studio-v2/get-started/configure-external-systems/#connection-credentials_credentialpermissiontypes), you could just avoid having two separate spaces and instead only allow your DBA user to use the DB "write" credential. So essentially moving the access control from space level to credential level.

    Avoiding separate spaces removes the need for import / export and credential remapping, which seem like activities that could risk human error.

    Secondly, I would think about splitting the "data corrections" workflow from the "Export (database update)" workflow. In this way, data corrections could be written to a snapshot by any user, but only the DBA could successfully execute the export workflow (which updates the snapshot contents back to the DB).

  • Hi Henry - the credential approach was my first thought too, but after some initial testing it looked like it would not work - the workflow that does the update and uses the external system DBA credential can still be executed by the user that does not have access to the external system DBA credential.

  • henry_
    edited November 26

    You're right, I hadn't realised it but the Credential permission really refers to whether a given user can use the Cred in workflow creation, or dataset creation, not workflow execution or dataset refresh where the credential has already been configured by a user who does have access.

    This is the expected behaviour, but I had forgotten how it worked.

    Given this, the two-space solution makes sense. However you should still be able to streamline this and avoid the workflow import / sync by creating the "dataset to be updated" in the PROD-DQR space, and then sharing this via a view to the PROD-UPDATE space. Then the UPDATE space would only need a simple Dataset → Export workflow using the DBA credential