How to auto-update datasets when the definition changes?

abhish Member
edited December 2023 in General

Hello,

Is there a way to automatically change the dataset definition when the external table changes? One of the datasets failed to load fresh data last night because the column definition changed.

Comments

  • Josh Boxer Administrator

    Hi Abhish

    It depends on how the table definition changed.

    A Dataset has a fixed column schema that it expects each new batch of data to adhere to. If a column may be missing in future (and you don't think this will be an issue for whatever you are doing with the data), you can set the column(s) as Optional when creating the Dataset. If the Dataset already exists, select Options > 'Annotate columns' and set them as Optional there.

    If columns might be added in future, then rather than bringing in the whole table you could use SQL to select only the required columns from it: https://docs.experianaperture.io/data-quality/aperture-data-studio-v2/get-started/configure-external-systems/#loaddatausingasqlquery

    If the names of the columns change, then I cannot think of a way to solve this automatically. If you used SQL to specify the columns to include, you could manually amend the SQL with the new column names, which will allow you to add these columns to the Dataset.
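To illustrate the SQL suggestions above, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for the external database. The table `customers` and its columns are hypothetical, and nothing here calls Data Studio itself. It only shows how a query with an explicit column list behaves when the source table gains or renames a column. The alias in the second case is an extra idea, not something confirmed for Data Studio: it assumes the tool takes the column name from the query result, in which case the Dataset would keep seeing the old name.

```python
import sqlite3

# Hypothetical table and column names, purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT, email TEXT)")
conn.execute("INSERT INTO customers VALUES (1, 'Ada', 'ada@example.com')")


def column_names(connection, query):
    """Return the column names that a query exposes to whatever reads it."""
    cursor = connection.execute(query)
    return [description[0] for description in cursor.description]


# The kind of query you would use instead of loading the whole table.
explicit_query = "SELECT id, name, email FROM customers"
before = column_names(conn, explicit_query)

# Case 1: the source table gains a column. The explicit query still returns
# the same columns, whereas SELECT * now returns the extra one as well.
conn.execute("ALTER TABLE customers ADD COLUMN phone TEXT")
after_add = column_names(conn, explicit_query)
star_after_add = column_names(conn, "SELECT * FROM customers")

# Case 2: a source column is renamed (RENAME COLUMN needs SQLite 3.25+).
# The original query now fails, so it has to be amended by hand.
conn.execute("ALTER TABLE customers RENAME COLUMN email TO email_address")
try:
    column_names(conn, explicit_query)
    old_query_still_works = True
except sqlite3.OperationalError:
    old_query_still_works = False

# Amended query: aliasing the new name back to the old one keeps the
# query's output columns identical to what they were before the rename.
amended_query = "SELECT id, name, email_address AS email FROM customers"
after_rename = column_names(conn, amended_query)

print(before, after_add, star_after_add, old_query_still_works, after_rename)
```

The rename still needs a manual edit to the query, as noted above. The alias only decides whether the Dataset ends up with a new column name or keeps the existing one.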