Dynamic DataSet Naming

Manoj Bhosle (Member)
edited December 2023 in General

Hi,

We are trying to use Aperture V2 as part of an ETL workflow. We need to pick up the dataset from a landing zone. The challenge is that the incoming file name follows a pattern, i.e. static text followed by a date/time stamp.

Aperture workflows only pick up a Dataset with a fixed name; the file name can only be changed when running in manual mode. Does anyone know how to overcome this issue?

Is there a way I can name a dataset to accept wildcard characters?
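
For illustration, the kind of wildcard matching we have in mind might look like this (the file names and prefix below are made up for the example):

    import fnmatch

    # Hypothetical landing-zone listing: static text followed by a date/time stamp
    incoming = [
        "CustomerExtract_20231201_0830.csv",
        "CustomerExtract_20231202_0830.csv",
        "OtherFeed_20231202_0900.csv",
    ]

    # We would like the Dataset/source to accept a wildcard pattern like this
    pattern = "CustomerExtract_*.csv"
    matches = sorted(fnmatch.filter(incoming, pattern))
    print(matches[-1])  # newest file by name, e.g. CustomerExtract_20231202_0830.csv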

Answers

  • Clinton Jones (Experian Elite)

    Hi @Manoj Bhosle, are you using dropzones or the API method?

    If you are using dropzones, then the dropzone doesn't care about the naming of the file, only the correctness of the schema/structure of the file.

    Data Quality user documentation | Configure Dataset dropzone (experianaperture.io)

    The following conditions must be met for the file to be loaded:

    • The Dataset must have originally been created from a file upload (the System will be "Imported file").
    • The file in the dropzone must have the same extension as the file used to create the Dataset.
    • The new file must have at least one column name in common with the existing Dataset. Missing columns in the new file will be loaded as null values in the Dataset. Additional columns will be ignored.
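
    Purely as an illustration of that column-matching rule (this is not Aperture code - just a pandas sketch with made-up column names):

        import pandas as pd

        existing_columns = ["CustomerId", "Name", "Email"]   # columns on the existing Dataset

        # Columns in the newly dropped file: "Email" is missing, "Phone" is additional
        new_file = pd.DataFrame({
            "CustomerId": [1, 2],
            "Name": ["Ann", "Bob"],
            "Phone": ["555-0100", "555-0101"],
        })

        # Common columns are loaded, the missing "Email" column becomes nulls,
        # and the extra "Phone" column is ignored.
        loaded = new_file.reindex(columns=existing_columns)
        print(loaded)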


  • Hi @Clinton Jones - thanks for the quick response.

    We are using the API to trigger the workflow from Azure ADF. The files arrive in Azure Storage.

  • Clinton Jones (Experian Elite)

    @Manoj Bhosle which version of Data Studio are you using?

    If you're using an API and not the native dropzone capability, what is calling the workflow - something custom that you have running on Azure?

  • Clinton Jones (Experian Elite)

    @Manoj Bhosle my expectation is that Azure Blob storage should work the same as regular dropzones.

    Data Quality user documentation | Auto-refresh Sources (experianaperture.io)

    If it isn't, either you're on an old version or there is something else going on.

  • Hi @Clinton Jones

    We are using Aperture 2.4.0.8. Azure ADF calls the Aperture workflow using an API call. We are able to run the workflow, and it picks up the file stored in Azure Blob Storage.

    Our issue is that the workflow always expects a static file name. We need a way for each execution of the workflow to pick up a file with a different date suffix (see the sketch at the end of this post).

    I hope that explains the situation.

    Many Thanks.
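
    For illustration only, a pre-processing step that selects the newest timestamped file from the landing zone might look something like this (the connection string, container and prefix are placeholders):

        from azure.storage.blob import BlobServiceClient

        # Placeholders - substitute your own storage account and container details
        service = BlobServiceClient.from_connection_string("<connection-string>")
        container = service.get_container_client("landing-zone")

        # All blobs whose names start with the static part of the pattern
        candidates = container.list_blobs(name_starts_with="CustomerExtract_")

        # Pick the most recently modified one (sorting by name also works,
        # since the suffix is a zero-padded date/time stamp)
        latest = max(candidates, key=lambda blob: blob.last_modified)
        print(latest.name)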

  • Clinton Jones (Experian Elite)

    How do you have the API call configured, and what are you using to invoke the API - PowerShell?

  • Clinton Jones (Experian Elite)

    Also... have you configured your workflow to prompt for a source name, or have you fixed the source?

  • Azure ADF (Data Factory) calls the API and triggers the Aperture workflow. ADF has built-in capability to call APIs, and this part is working correctly (a rough sketch of the call is at the end of this post).

    We have a fixed source name; we have not tried the prompt option.

    I will go through the document you referenced - Data Quality user documentation | Auto-refresh Sources (experianaperture.io) - and see if that resolves our problem.

    Many Thanks.
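
    For context, a very rough sketch of that kind of trigger call (the endpoint path, token and payload below are placeholders, not the actual Data Studio REST API - the real details come from the product's API documentation and the workflow's execution settings):

        import requests

        # Placeholder endpoint and credentials - not the real Data Studio API
        url = "https://<your-aperture-host>/<workflow-execution-endpoint>"
        headers = {"Authorization": "<api-key-or-token>"}

        # Trigger the workflow; ADF's Web activity does the equivalent of this POST
        response = requests.post(url, headers=headers, json={"source": "<dataset-name>"})
        response.raise_for_status()
        print(response.status_code)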

  • Clinton Jones (Experian Elite)

    @Manoj Bhosle something to keep in mind is that if you use the dropzones feature in Data Studio, the data will be parsed automatically, and if you configure your workflow to run on a source change it will also execute automatically, so it may not be necessary to use APIs at all. A rough sketch of that approach follows.
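
    As a rough sketch of that approach (the names below are placeholders, and the container stands for whatever location the Dataset's dropzone/auto-refresh is configured to watch):

        from azure.storage.blob import BlobServiceClient

        # Placeholders - substitute the storage account and dropzone container details
        service = BlobServiceClient.from_connection_string("<connection-string>")
        dropzone = service.get_blob_client(container="dataset-dropzone",
                                           blob="CustomerExtract_20231202_0830.csv")

        # Drop the new file; Data Studio parses it and, if the workflow is configured
        # to run on a source change, the workflow executes without any API call.
        with open("CustomerExtract_20231202_0830.csv", "rb") as data:
            dropzone.upload_blob(data, overwrite=True)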