Gen AI in Aperture Data Studio

Josh Boxer Administrator
edited July 11 in General

Aperture Data Studio has functionality that uses several different machine learning and artificial intelligence techniques, including clustering, cosine similarity, deep learning and, more recently, Gen AI.

Gen AI Actions [in v3.1]

When exploring data in the grid, you can describe what you want to calculate or how else you might want to manipulate the data.

Examples:

  1. Show the top 10 count of sales by City in United States
  2. Calculate total sales and average sales by country rounded to two decimal places excluding France ordered largest first
  3. Compare Sales date with current date and return true if its within past week
  4. Update Email column removing everything before @ symbol
  5. Concatenate Forename and Surname columns separated by a space character into a new column called 'Full Name'
  6. Multiply the price by 10 if the quantity is 5 or fewer
  7. Remove duplicate rows
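Because the user remains responsible for checking the result, it can help to work out the expected output for a few rows by hand before clicking through the generated Actions. The sketch below does this in plain Python for examples 4 and 6. It is not how Aperture implements the Actions; the sample rows, column names and helper functions are all hypothetical, and keeping the @ symbol in example 4 is one possible reading of the description.

```python
# Hypothetical sample rows; the column names are made up for illustration.
rows = [
    {"Email": "ann@example.com", "price": 2.5, "quantity": 5},
    {"Email": "bob@test.org", "price": 4.0, "quantity": 6},
]


def strip_before_at(email: str) -> str:
    """Example 4: remove everything before the @ symbol.

    Assumption: the @ itself is kept. The description is ambiguous,
    which is exactly the kind of detail worth checking in the Action.
    """
    at = email.find("@")
    return email[at:] if at != -1 else email


def adjusted_price(price: float, quantity: int) -> float:
    """Example 6: multiply the price by 10 if the quantity is 5 or fewer."""
    return price * 10 if quantity <= 5 else price


# Expected values to compare against what the grid shows.
expected = [
    {
        "Email": strip_before_at(r["Email"]),
        "price": adjusted_price(r["price"], r["quantity"]),
    }
    for r in rows
]
```

Comparing `expected` with the grid for the same rows shows quickly whether a boundary such as "5 or fewer" was understood as intended.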

Users must confirm that they wish to use AI. In doing so they acknowledge that the text they have entered and the dataset column names, but none of the data, will be shared with a third-party Gen AI model, that AI can make mistakes, and that they are responsible for ensuring anything built using AI is accurate and correct.

Within a few seconds, Aperture will show a list of Actions applied to the data to meet the description. Each of these Actions is interactive, allowing the user to click through to check that a filter or transformation has been understood correctly and, if necessary, make manual adjustments to ensure they are seeing the correct result.


These same Actions could be created and applied to the dataset manually, assuming the user knows which actions and functions to apply, but the process saves the user multiple clicks and keystrokes (selecting options, naming new columns, renaming Actions, etc.), which adds up to a significant amount of time saved. This time can be better spent checking and optimizing to get the best possible outcome.

It is also now possible to Insert an Action into the middle of a list, which can be helpful if AI has missed something from the description.

Once the list of Actions has been checked and confirmed as correct, a user can Save as View or use the newly added Save as Workflow option, which can be much more efficient than creating a Workflow from scratch.

Gen AI Functions [in v3.0]

It is possible to accelerate the building of a Function or validation rule by describing it and opting to use AI. There is also a separate option for AI to generate test values.

Examples:

  1. Ensure each book isbn is correctly formatted and that the author field is not empty and the publication year is within a realistic time period
  2. Validate a date of birth to ensure the person is over 18 years old and under 110 years old
  3. If order_date is in last 90 days and total_value is greater than $500 then discount_code must be X or Y or Z
  4. In the Email column remove everything before @ symbol and title case values
  5. Lowercase name and remove any characters that are not alphabetic
  6. Given two timestamps as input, calculate the elapsed time in hours

Users must confirm that they wish to use AI. In doing so they acknowledge that the text they have entered and the dataset column names, but none of the data, will be shared with a third-party Gen AI model, that AI can make mistakes, and that they are responsible for ensuring anything built using AI is accurate and correct.


For more technical users: the model will know any scripting language that is well documented on the Internet (Python, SQL, DAX, etc.), so the functionality can be used to quickly convert an existing scripted business rule by entering “Convert this Python script: …”
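As an illustration of the kind of existing scripted rule a user might paste after such a prompt, here is a minimal Python sketch of example 2 (date of birth between 18 and 110 years). The function names are hypothetical, and treating someone who turns 18 today as valid is an assumption about what "over 18" means; it is the sort of edge case to confirm in the generated Function.

```python
from datetime import date


def age_on(dob: date, today: date) -> int:
    """Whole years between dob and today."""
    years = today.year - dob.year
    # Subtract one if this year's birthday has not happened yet.
    if (today.month, today.day) < (dob.month, dob.day):
        years -= 1
    return years


def is_valid_dob(dob: date, today: date) -> bool:
    """True if the person is at least 18 and under 110 years old on `today`.

    Assumption: the 18th birthday itself counts as valid.
    """
    return 18 <= age_on(dob, today) < 110
```

Passing `today` explicitly keeps the rule repeatable, so the same boundary dates can be reused as test values for the converted Function.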

This same Function could be created manually, assuming the user knows which functions to select, but the process saves the user multiple clicks and keystrokes (selecting options, entering values, naming Variables, etc.), which adds up to a significant amount of time saved. This time can be better spent checking and optimizing to ensure the Function generates the correct results for extreme edge cases before the Function is published and used on data.

Test parameter values

Models have been trained to be good at creating test values, so there is a separate option to take advantage of this. Using just the column name, such as ‘start date’, ‘email address’ or ‘total sales’, the model can infer the likely type of data and generate some values in the correct format.


Security and opt out

All generative AI functionality is optional. An admin can turn off the system setting to hide it from users if this is desired (to meet internal compliance standards for example).


However, leaving it enabled is highly recommended, as it is a significant accelerator for users both when learning how to use the application and when creating new objects such as Functions.

For security reasons, no data is shared outside of Aperture with any third-party AI model. The only information shared is the (natural language) text prompt entered by the user and the names of the column headers if working on a specific dataset.
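To make the boundary concrete, the sketch below builds a request body from only the prompt text and the column header names, never the row values. It is purely illustrative: the function name, the field names and the JSON shape are assumptions and do not describe the actual request Aperture sends.

```python
import json


def build_prompt_payload(user_text: str, rows: list[dict]) -> str:
    """Hypothetical payload: the prompt plus column names, no cell values."""
    columns: list[str] = []
    for row in rows:
        # Iterating a dict yields its keys (header names) only.
        for name in row:
            if name not in columns:
                columns.append(name)
    return json.dumps({"prompt": user_text, "columns": columns})


# Hypothetical dataset rows; their values must not appear in the payload.
sample_rows = [
    {"City": "Boston", "Sales": 1200},
    {"City": "Austin", "Sales": 950},
]
payload = build_prompt_payload("Show the top 10 count of sales by City", sample_rows)
```

In this sketch the payload carries "City" and "Sales" as names, but none of the values such as "Boston" or 1200.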

The functionality is not a general chatbot; it is targeted at supporting specific user actions. Again, no data is shared. This is similar to asking ChatGPT or Copilot to generate a SQL query to calculate total sales by country, with the difference that, rather than showing the user a piece of script, Aperture turns the response into something interactive and visual that the user can click through to check the results meet their expectations.

More information and FAQs: https://docs.experianaperture.io/more/request-for-information/aperture-data-studio#artificial-intelligence

If you have questions or feedback please ask below or reach out to your Experian contact.
