-
Data Studio Salesforce Connection
Connecting Data Studio directly to your data sources and CRM systems, like Salesforce, has never been easier. This video is to remind and inform users just how easily two systems can work in harmony without any complex development. We've created out-of-the-box connectors for a wide range of common platforms, to make…
-
Fuzzy Matching Logic
Hi, I have a requirement to match records from two separate systems on the basis of FirstName, LastName and DOB from System1 and FirstName, MiddleName, LastName and DOB from System2, to return the match Id from System2 to System1. Below are the challenges: The names are not consistent in both systems. For example, in System1 for…
-
Automations and notifications.
if I want to trigger automation if any workflow failed in that space. is that true for that case as well. I was going through some documentation. Data Quality user documentation | Automations and notifications so here i would want the same for workflow failure. any workflow fails in that space, i want the event to trigger…
-
swagger UI for find-duplicates- error. after recent upgrade.
Hi, We are seeing this error after the latest upgrade. The upgrade done on our dev is as follows: upgraded Aperture Data Studio to version 2.15.4 and Find Duplicates Service to version 3.11.3. @Ian Hayden , @Henry Simms , kindly help and let us know what this is about. Services are UP, but we usually don't see this error when we…
-
Invoking PowerShell script from Experian workflow/Schedule
Is there a way to invoke a PowerShell script from Experian Aperture Data studio using workflow or schedule? I am using the 2.14.14 version
-
Rows to Columns
Hi, I have been testing out the Rows to Columns function. However, it does not seem to be pulling through all the columns. In the DEV data set, I currently have 62 rows which I would like converted to columns. When using Rows to Columns, it automatically only finds around 56. Wondering if anyone else has had a similar…
-
Flattening out / Pivoting / Transforming results
Hi, I'm having an issue flattening out results. Essentially, I have multiple rows for the same customer because they have different PII's. Is there any way of transforming / pivoting / flattening out the results to have one row for each customer? I have left a screenshot of mock-up data below to help people understand. I…
-
Compare Dates (Verbose) 📅📅
Compare Dates (Verbose) [Category: Conversion] This function provides a summary of how 2x input dates compare (and also includes simple 'convert to date' logic to aid the analysis). Output options are: Exact match Incomplete - Date 1 is missing (or not a date) Incomplete - Date 2 is missing (or not a date) Incomplete -…
-
📣 Join us at our next User Group - 9th November 2023 📅
I am pleased to confirm that our next User Group event for Aperture Data Studio users will take place on Thursday 9th November, 11am-3pm (UK time). We will be running the event from our offices in London (near Victoria station) but will have a dial-in option for those who are unable to be there in person. Following on…
-
Support for VARBINARY(MAX) field type in ADS
Hello, I have a use case that requires migrating Salesforce objects that contain attachments such as HTML Body, PDF, Word, Excel and JPG attachments to another instance of Salesforce. These are normally stored as VARBINARY(MAX) in an MS SQL database and I think would be a Blob in SQLite. Given these data types are…
-
Error while invoking REST API from Experian Studio VM
Background: Experian Data Studio version 2.14.14. SSL setup is done on the Data Studio URL and the private CA is added to the trust store under E:\ApertureDataStudio\certificates\cacerts. API key generated with 365 days validity and able to invoke the API via Swagger. Issue: While trying to invoke the cURL command from the command prompt…
-
♻ Reusable workflow example – calculate word frequency
Today I’d like to share with you a reusable workflow that I built a little while ago not only as a downloadable .dmx file (see below) for you to explore/use as you wish, but also as a means of demonstrating one of the key themes of Data Studio…reusability. If you find that there are duplicate tasks being executed on your…
-
Experian Aperture Data Studio REST API error
Hi, I am planning to automate dataset creation using the Data Studio REST API, however I am getting a 401 Authorization Error. Background: Experian Data Studio version 2.14.14. SSL setup is done on the Data Studio URL and the private CA is added to the trust store under E:\ApertureDataStudio\certificates\cacerts. Please see the steps…
-
Group Step > Being able to select and right click view multiple rows in groups
What problem are you facing? When the Group step is used, we can only view rows in group on one value. What impact does this problem have on you/your business? Feedback received of this not being user friendly and having to exit the screen to be able to get to the multiple filtered values. Do you have any existing…
-
👩‍💻 The HTTP Request (REST API) step
This post is to raise awareness of an additional workflow step, that is increasingly being used to provide value in many different Aperture Data Studio deployments. This step allows workflows to call a REST API in a flexible yet well-controlled manner. What do you mean additional workflow step? When you install Data…
-
Assign latest changed id to all linked records
Hi, I have a scenario where input file is providing changed address ids in from-address and to_address columns. Need to identify the links between the records using these 2 fields and assign latest id to all the linked records. Input: event_date,from_address_id,to_address_id 20210223,120,160 20210402,120,160…
-
Error attempting to delete match store
Hi, In my project, we are using Experian Aperture Data Studio (2.10.10) to connect to a remote Find Duplicates match store (3.8.15) over HTTP. The Find Duplicates step is configured with 'Clear and re-establish store' checked as part of our requirements. The majority of the time, the process runs end to end successfully,…
-
Issues with Dev Environment Post-Upgrade
I'm reaching out to address some critical issues we've been encountering with our Dev environment following the recent upgrade. After the upgrade to the latest versions, we've been experiencing difficulties that are hindering our work. our current versions: Aperture Data Studio: 2.13.7 Find Duplicates: 3.10.2 Specifically,…
-
Use a View in the Map to target step
The Workflow step Map to target allows the schema (column order and column names) of a table of data to be quickly updated to match another data table’s schema. It is now possible to use a View as the ‘Target schema’ that is being mapped to, which is useful: to create a schema in Explore mode using Transform to share a…
-
Data Management User Group 6 March 2024: Recap and reflections
A note from Sarah Williams reflecting on the Data Management User Group that was held on 6 March 2024: Great to see so many of you join us at the March Data Management User Group, both in person and online. Every interaction with you helps us to glean rich insights, which fuel our future product innovations and we hope…
-
Managing Data Quality issues using Issue lists
Issue lists allow users in the same Space (including read-only Consumer users) to capture problematic records, allocate owners and collaborate on a resolution. Once a data issue has been ‘fixed’ in the source system, the Workflow will automatically set the Issue to the ‘Resolved’ status. Here is a detailed article about…
-
Compare two data sources
The Compare step allows you to detect and highlight differences between two data sources, e.g. year on year sales data or monthly customer sign ups. There are 3 outputs available to help determine if changes are as expected or if not, need further investigation. Results by key: quickly see if records have changed based on…
-
ssl cert update process.
Has the process changed to update the SSL cert on the newer version? Previously I used to do this and hit save; now I see this. Does the section for the public certificate need to be empty? Your guidance will be really appreciated, as our cert is expiring in 1 month or so on our dev server. Br, HS
-
📋 Index of reusable Functions! 📇
This post simply acts as an index for all the functions shared in this library. If you would like to receive a notification whenever a new function is added, please bookmark this post by clicking on the bookmark icon to the right of the title. Current functions available: 👪 Parse Full Name - A handy set of functions that…
-
Business Validation and Enrichment
Aperture is a powerful tool when validating business information you already have. We can leverage trusted address sources, like the Royal Mail’s Postal Address File, for the UK. This will ensure that the addresses and business name information you hold is accurate and deliverable as per that trusted source. That’s all…
-
UK Postcode formatter 🏡
Summary This package contains a single function but may be updated in the future to include others. UK Postcode formatter 🏡 This function simply adds a space between the incode and the outcode for a given UK postcode (e.g. SW1A1AA becomes SW1A 1AA). See below for preview of the function definition logic and some sample…
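For readers who want to see the same idea outside Data Studio, here is a minimal Python sketch of the formatting rule described above. It is not the function shipped in the package; the name `format_uk_postcode`, the upper-casing and the handling of very short inputs are assumptions made for illustration. It relies on the fact that the final part of a full UK postcode is always three characters.

```python
def format_uk_postcode(raw: str) -> str:
    """Insert a single space before the last three characters of a postcode."""
    # Assumption: strip any existing whitespace and upper-case the value first.
    compact = "".join(raw.split()).upper()
    if len(compact) < 5:
        # Too short to be a full postcode, so return the normalised value as-is.
        return compact
    return f"{compact[:-3]} {compact[-3:]}"


print(format_uk_postcode("SW1A1AA"))  # SW1A 1AA
```

The sketch only reformats; it does not check that the result is a real postcode.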
-
🔎 Data profiling showcase
Ever since Aperture Data Studio’s precursor, Pandora, one of the most used and most loved features of the software has been its profiling capabilities, and in recent releases this has come on even further. Data profiling is fundamentally the process of rapidly evaluating data (during a discovery phase or…
-
Aperture Data Studio 2.12.8, Find Duplicates 3.9.1
We upgraded our prod servers today, and now we have the same situation: settings in the Find Duplicates .ini file get overwritten and experianmatch starts storing everything on C. We changed the ini file settings to the D drive, and after that we have to rename the duplicate store to get it stored on the D drive again. We don't want to…
-
🦸‍♂️ Data Management User Group event summary
Last week we welcomed over 40 customers to our latest Data Quality User Group. Held in our London office and online, the session was a chance to: hear about product updates, find out about our recent user experience research, view live demos and hear from Shawbrook Bank on how they’ve implemented Aperture Data Studio. One…
-
Use of If-Then-Else logic in Aperture Data Studio
Imagine you are asked to use Aperture Data Studio to generate a new field for your sales data. You have the input fields: Discount Code, Quantity and Price. The ask is to generate a new field “Offer Price” using the following logic: How can you do this in Aperture Data Studio? The answer is to build a custom transformation…
-
Documenting what data is available in Data Studio (to users who don't have access)
A question that cropped up earlier today was: 'how can we avoid different users connecting to the same data in different spaces?' Depending on how you've set up Data Studio and the processes around how your users work with it, this can be a bit of a challenge to tackle at the moment. However there's a solution, which is…
-
Upgrade Issue, Aperture data studio , Find Duplicates latest versions
Hi Ian, It looks like after the recent upgrade we have an issue where the experianmatch path was changed to the C drive automatically, meaning it did not follow our existing configuration, which was to store this on the D drive. And even after changing the path manually in the Find Duplicates ini file, it still does not update…
-
Low Disk Space issue.
Hi, We have the disk space issue once more for our data drive. We changed the index expiry settings to 8 days, previously it was 15 days, last time we resolved this. There is still huge temp folder, this is to ask you what can be removed from this folder. 1,4 TB is temp, can we remove dump***.dat files in temp folders…
-
Standardise Country 🌎
Standardise Country The function uses reference data (contained within the .dmxd file) to standardise known country aliases to the standard form (e.g. England = United Kingdom). See below for a preview of the logical definition of the function and a preview of the dataset that it comes with. If, after processing your data,…
-
Workflow is not in an executable state but no errors exist
Hi DQ Community, I have a workflow that I can't run or publish because I get error messages telling me that one or more workflow steps are invalid, or that the workflow is not in an executable state. I've been through the workflow multiple times and all the steps are valid - I can open the final step and view results, and…
-
Autonomous REST connection SQL query
How can I get a simple SQL running when everything is set up . the Autonomous REST connection is success. But when I try to execute SQL to add new data sets, I am getting this error, and experian documentation here, has not explicitly mentioned anything how to get those running. I checked documentation there, already, and…
-
Handling data from Experian Aperture Data Studio jobs
We are getting error in the execution of Experian scheduled jobs as "There is not enough space on the disk" We have installed the data from the Experian Aperture Data Studio on the E Drive (511 GB) out of which only 19 GB is remaining. Majority of the data is occupied in the E:\ApertureDataStudio\data\resource folder under…
-
How is true / false field stored in Aperture?
Note that the data below is business data and not PII. Issue: When I import the CSV file created in Aperture into PowerCurve Assisted Strategy Design, it does not recognise the values in the CSV file as being text “TRUE” or “FALSE”. If I copy the values in this column and paste them into the same column as Values, then it…
-
Want to pass certain input from API to the workflow
i am trying to call an API from an external website my agenda is to pass some IDs that will be taken as input in the workflow and the workflow will run for those IDs only and finally if i could export the result in some kind of file (csv,excel,xml)
-
Copy of FiServ Files uploaded into DA & use of the Metro2 Parser
Does anyone use Data Arc with a copy of a FiServ Metro2 File? We are setting this up as net new and it appears that FiServ does not include the RDW in the header or the file. This is limiting our ability to upload the dataset into the tool. Has anyone had any experience with this?
-
Stored Procedure invoked from PostSQL of Export step is not processing all records
hi, In my workflow, I am connecting to SQL server 2012 using Custom JDBC jar (Microsoft JDBC Jar 12.2) and the connection string looks like below…
-
Date formats...
Earlier today I received the below email from a customer which I wanted to post on here in case the answer is of use to others: I was wondering if you could help me overcome an issue I have been facing in Aperture. When uploading a file to Aperture through the “Add Dataset” section, in the “Define settings” menu there is a…
-
Error while connecting to Embedded Duplicate Store
Hi, I am getting error as below when I am trying to connect to the Embedded Duplicate store in my Data studio server(2.10.10). Please let me know if you have faced similar issue and help me with the steps to resolve this. PS: We do have a remote duplicate store(3.8.15) which is working fine and we have secured the Data…
-
Error while invoking stored procedure from Experian Aperture Data Studio using custom JDBC driver
Hi, I am using the Experian Aperture Data Studio (2.10.0.80) and have configured a custom JDBC Jar for adding Microsoft SQL Server Driver to query On premise SQL server(Microsoft SQL Server 2012 (SP4-GDR) (KB4532098) - 11.0.7493.4). With this driver, I am able to perform DML SQL queries like Select, Insert and Update.…
-
shutdown due to heartbeat?
2023-04-13 08:57:06,646 INFO c.e.d.b.BatchService [Thread-0] Shutdown due to heartbeat timeout 2023-04-13 08:57:16,218 INFO c.e.d.b.BatchRunnerProcessMonitor [batch-client-system-akka.actor.default-dispatcher-5] *** Batchrunner process terminated 2023-04-13 08:57:16,218 WARN c.e.d.b.BatchClientActor…
-
ADS, not enough permissions for page.
user is admin to the workspace. what could possibly be the reason for her not to see the dashboard in this workspace?
-
Aperture Data Studio - Find Duplicate Servers
Hello Community, Our organization has recently decided to use ADS as its Data Quality tool and we are trying to install ADS in a High Availability setup in an AWS environment. We are trying to install Data Studio on an EC2 instance as per the documentation, and several Find Duplicates instances across 3 Availability Zones. Help in any…
-
Keyword: Community - Data Management User Group Material 09th March 2023
It was great to meet the Experian Aperture Team and fellow users last week. Just wondering if and when the slides and any other additional material (Q&As etc) might be available and whether they'll be emailed or shared via Download? Would be really useful to walk through with colleagues that couldn't attend and to provide…
-
Future Date Check 📅 (Dynamic)
Summary This package contains 2x rules: “Is a future date” and “Not a future date” (compared to current date at time of execution) which can help assessing whether certain date values are fit for purpose: Is A Future Date? [Category: Validation] Checks that the input date is a date in the future (relative to the date at…
-
Convert Epoch Date/Time to Standard Date/Time ⌚
Convert Epoch Date/Time to Standard Date/Time ⌚ This function was inspired by this post and created to support any users exploring use of the Data Studio API as a source to generate meaningful MI (e.g. reporting on sessions/events/activities/jobs/users and more), as Data Studio publishes times in the epoch time format…
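As a rough illustration of the conversion itself (not the Data Studio function), the sketch below turns an epoch value into a timezone-aware UTC date/time in Python. Whether the API returns seconds or milliseconds is not stated in the excerpt, so the `unit` parameter and its `"ms"` default are assumptions; the function name is hypothetical.

```python
from datetime import datetime, timezone


def epoch_to_datetime(value: int, unit: str = "ms") -> datetime:
    """Convert an epoch timestamp to a UTC datetime.

    Assumption: `unit` is "ms" for milliseconds or "s" for seconds.
    """
    if unit not in ("ms", "s"):
        raise ValueError("unit must be 'ms' or 's'")
    seconds = value / 1000 if unit == "ms" else value
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


print(epoch_to_datetime(0).isoformat())  # 1970-01-01T00:00:00+00:00
```

Returning an aware UTC value avoids results that silently depend on the server's local time zone.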
-
📣 Join us in person at our next UKI Data Management User Group | 9th March 2023 📅
I am pleased to confirm that our next UK Data Management User Group will take place on Thursday 9 March 2023, 10.30am until 3pm London UK time, and if you are an Experian client based in the UK you can sign up today. This time we are excited to host the event at our newly refurbished offices in Cardinal Place, 80 Victoria…
-
Java pid hprof crash files.
Facing the issue in our dev server for Aperture Data Studio and Find Duplicates. In log I have found this at the same timestamp when I got the crash file in installation dir. 2022-12-14 05:20:12,255 ERROR c.e.m.a.MatchInstance [pool-157-thread-1] An unexpected error occurred when running Find Duplicates step against…
-
Java Update - Vulnerability detected on our Experian Dev/Prod Server
This version has been detected as vulnerable by - OCPU-2022-JUL: Oracle Java Critical Patch Update Advisory - July 2022 when we try to update this, I am receiving this notification, Do we need to switch to OpenJDK, or can we keep using below, and proceed with Updated, I am confused how all of this affects Jar files we have…
-
Join us Online for our next Data Management User Group | 15 September 22 !
I am pleased to confirm that our next UK Data Management User Group will take place on Thursday 15 September 2022, 10.30am until 3pm London UK time, and if you are an Experian client based in the UK you can sign up today. We’re also excited to confirm that this will be our first hybrid session. The event will be hosted at…
-
Aperture Data Studio backups
@Henry Simms , what would be the equivalent link to this according to current version. https://www.edq.com/documentation/aperture-data-studio/help/#backup this link seems to be broken, maybe it pointed to some old version or documentation. regards, HS
-
Quick Actions menu
Since v2.8 Aperture Data Studio has had a command palette that makes shortcuts and access to frequent actions more discoverable and accessible. Simply press CTRL + SHIFT + P from anywhere in the application to open the Quick Actions menu (and use the same shortcut or Esc to close the menu). The list of actions shown is…
-
Challenge 5 – Fibonacci Sequence to X
The Fibonacci Sequence is the series of numbers 0, 1, 1, 2, 3, 5, 8, 13, 21, 34,.... The next number is found by adding up the two numbers before it. - the 2 is found by adding the two numbers before it (1+1) - the 3 is found by adding the two numbers before it (1+2) - the 5 is (2+3) - and so on! Despite seemingly…
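The rule described above (each number is the sum of the two before it) can be sketched in a few lines of Python for comparison with a Data Studio solution. The challenge text is truncated here, so reading "to X" as "all values not exceeding X" is an assumption, as is the function name.

```python
def fibonacci_up_to(limit: int) -> list[int]:
    """Return the Fibonacci numbers, starting 0, 1, that do not exceed `limit`."""
    sequence = []
    current, following = 0, 1
    while current <= limit:
        sequence.append(current)
        # The next number is the sum of the two numbers before it.
        current, following = following, current + following
    return sequence


print(fibonacci_up_to(34))  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```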
-
Does Aperture have the "treat leading zeros as alphanumeric" option that Pandora had
Pandora had a global setting that instructed it to treat any input field with leading zeros as alphanumeric so that the leading zeros were preserved. While I can see that one can set a field which can have leading zeros as alphanumeric, this seems to require an analysis of the data to know that it has leading zeros -…
-
Repeating Characters 🔁
Check for same character repeated This function uses a regular expression to identify records where the entire value of a cell is made up of the same character repeated (e.g. "aaa" or "0000000"). The idea here is to detect cells which are clearly just entered with default values (either through automated processing, e.g.…
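The exact regular expression used by the function is not shown in this excerpt. One pattern that matches the description (the whole value is a single character repeated) is a captured character followed by one or more back-references to it, sketched here in Python. The function name and the choice to require at least two characters are assumptions.

```python
import re

# One captured character followed by one or more copies of that same character.
_REPEATED_CHAR = re.compile(r"(.)\1+", re.DOTALL)


def is_single_repeated_char(value: str) -> bool:
    """True when the entire value is the same character repeated (2+ times)."""
    return _REPEATED_CHAR.fullmatch(value) is not None


print(is_single_repeated_char("0000000"))  # True
print(is_single_repeated_char("aab"))      # False
```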
-
How to copy dataset header to a row within the same dataset?
Hello, I want to create an additional row on my dataset which would contain a copy of the header names, E.g. Input: Header1,Header2,Header3,Header4 ---------------------------- Headers 1,2,3,4 ----------------------------- Data 1,2,3,4 ----------------------------- Data Output Header1,Header2,Header3,Header4…
-
Extract First Word 🥇📝
This function extracts the first word in a string in which all words are separated by spaces. The function returns the first word regardless of whether the string starts with a space. If the string consists of only one word, it returns the word as it is received in the input. The function can handle a space before the word. If in…
-
Replace the word ‘NULL’ or any non-null space values with null 🔄👻
This function replaces the word ‘Null’ (case insensitive) with a null value. For example, if the value of a column has been visible and people can read the word ‘Null’, once it is replaced this column will have an empty value. Moreover, if there are any empty values, which however are not read by Aperture Data Studio as…
-
Extract Last Word 💬🥉
Extract Last Word Version I This function extracts the last word in a string in which all words are separated by spaces. If the string consists of only one word, it returns the word as it is received in the input, removing any leading space. The function can handle a space before the word. If in the input there is a single…
-
Aperture data studio log file location
Hi, I have just noticed something on our prod servers. The log file generated for Aperture Data Studio is still being created in the old location along with the archived log folders, whereas the repository resides in the new location. The repository is here right now: D:\ApertureDataStudio\data\repository and the log file is still…
-
Word Frequency
Every now and then a scenario crops up where it'd be handy to know how often a given word occurs within a given dataset. For example, you're profiling a reasonably standardised list of values (e.g. job titles) and you want to identify unusually common terms (like 'manager', 'executive' etc.) or infrequent ones (like 'test123').…
-
Parse Date 🧽📅
Parse Date This function is designed to have flexibility to accommodate all sorts of different date formats, which can be configured by the user. In addition to mapping the input, as you can see, the user is prompted for a 'format' when using this function and this can be configured intuitively as you can see in some of…
-
Huge HPROF file in installation folder.
java_pid2376.hprof What is an HPROF file and what is it doing in my ADS installation folder? C:\Program Files\Experian\Aperture Data Studio 2.7.3 It is 35 GB. I believe it is junk and is the result of an error. @Henry Simms please comment. Would it be OK to delete it, or would it affect my server installation in any way? I…
-
Challenge 4 - Cleaning product (data)
Want to test your Aperture Data Studio skills a little more? This next challenge requires you to analyse the situation, compare the starting point to the destination state and figure out what functionality needs to be applied to navigate the journey. The sample data for this challenge is some messy product data: As you can…
-
Notifications based on emails in the data
We have 50 service desks placing orders for customers around the country. We want to monitor those orders for data quality errors and notify the desk leaders of any fails which occur on an hourly basis during the working day. Currently each notification email address has to be typed in manually, so even though the desk…
-
Low Disk Space Notifications and others.
Hi team, this is to ask you if I create a low disk space notification on aperture data studio system settings. Will it notify me the diskspace of the experian match store drive or the installation drive. Second question that I have is in case of an abrupt stop or restart of my experian data studio and find duplicates…
-
Calculate the distance between two sets of co-ordinates 🌍️
Summary Returns the approximate distance in kilometres between a pair of geographical co-ordinates. Input latitude1, longitude1, latitude2, longitude2, output value is distance between each set of co-ordinates while considering the spherical nature of earth using the law of haversines. Example Buckingham Palace, London is…
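For reference, the haversine calculation named in the summary can be sketched in Python as below. This is an illustration rather than the packaged function; the mean Earth radius of 6371 km and the function name are assumptions, and results are approximate because the Earth is treated as a perfect sphere.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Approximate great-circle distance in kilometres between two points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    delta_phi = phi2 - phi1
    delta_lambda = math.radians(lon2 - lon1)
    # Haversine of the central angle between the two points.
    a = (math.sin(delta_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(delta_lambda / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))
```

Inputs are decimal degrees in the same order as the function described above: latitude1, longitude1, latitude2, longitude2.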
-
Force workflow failure due to failing rows
Is there a way I can force a whole workflow to fail as a result of failing rows in a validation step (at a desired tolerance level)? I want my workflow to stop and fail when I have a certain number of failing rows. I have email notifications set up with fire event to email me when I have failing rows but I also want the full…
-
Enabling LDAP
In case you haven't noticed, LDAP has been introduced since Aperture Data Studio v2.0.11. Go to Settings>Security to set up the LDAP properties. Once this is set up, you can test the connection to the ldap server. A message will be shown to indicate whether the connection is successful or has failed. Once the connection…
-
Issue with dates being read as American dates
I am pulling in my source data with a column that contains dates of the format dd/mm/yyyy. When annotating the columns, I have specified the column to be a date column. Now ready to explore my data, most of the dates in the column have correctly loaded in format dd/mm/yyyy. However some have converted to format dd-MMM-yyyy…
-
Next ⏭️📅 & Previous ⏮️📅 Working Day
Summary The 2x functions contained within this package relate to identifying the last working date before and next working date after given input (i.e. Excluding Weekends). Note that this function does not take into account Bank Holidays. Get Date of Next Working Day Returns the next working day after a given input (i.e.…
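The weekend-skipping logic described above can be sketched in Python as follows. The function names are hypothetical and, like the package, this sketch ignores Bank Holidays.

```python
from datetime import date, timedelta


def next_working_day(day: date) -> date:
    """Return the first Monday-to-Friday date strictly after `day`."""
    candidate = day + timedelta(days=1)
    while candidate.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        candidate += timedelta(days=1)
    return candidate


def previous_working_day(day: date) -> date:
    """Return the last Monday-to-Friday date strictly before `day`."""
    candidate = day - timedelta(days=1)
    while candidate.weekday() >= 5:
        candidate -= timedelta(days=1)
    return candidate


print(next_working_day(date(2023, 11, 10)))  # 2023-11-13 (Friday -> Monday)
```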
-
Last day / working date of month 📅
Summary The 2x functions contained within this package relate to identifying the last date of the month (and the last working day of the month) for a given input. Last Date of Month This function returns the last date of the month for a given input to help make it easier to assess any business rule logic that requires this…
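A minimal Python sketch of both calculations is shown below for comparison. The function names are hypothetical, and "working day" is assumed to mean Monday to Friday with no allowance for Bank Holidays.

```python
import calendar
from datetime import date, timedelta


def last_date_of_month(day: date) -> date:
    """Return the final calendar date of the month containing `day`."""
    # monthrange returns (weekday of the 1st, number of days in the month).
    days_in_month = calendar.monthrange(day.year, day.month)[1]
    return day.replace(day=days_in_month)


def last_working_day_of_month(day: date) -> date:
    """Return the final Monday-to-Friday date of the month containing `day`."""
    candidate = last_date_of_month(day)
    while candidate.weekday() >= 5:  # step back over Saturday/Sunday
        candidate -= timedelta(days=1)
    return candidate


print(last_date_of_month(date(2024, 2, 10)))  # 2024-02-29
```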
-
How to promote from dev to prod environments when using different external systems
Problem: I’ve recently been working with a client who had used their Data Studio development environment to create workflows based on data they’d extracted from their development SQL Server database. They were in a position to promote those workflows to their production environment, and also point the data sources to use…
-
Dynamic Naming of Output File
Hi, Is it possible to have a dynamic text element in the name of an output file? I understand that the name can consist of text and a date stamp; however, does the text need to be static or can it be dynamic based on the content of the output file? I essentially want to extract a reporting date from the output to be…
-
Error with reusable workflow step in new workflow
Hi, I have created a workflow that produces an output and has been made reusable. This workflow is then used as the start of a second workflow, and previously the second workflow has run correctly. However, now the second workflow is causing errors and when I click to see the output of the first workflow within the second…
-
👋 Introduction to the Functions Library 📂
Intro This area of the Community hosts our library of re-usable functions for use in Aperture Data Studio. Whilst Aperture Data Studio comes with a wealth of native functions out-of-the-box, this area of Community has been established to provide a reference library of further re-usable functions that you can easily add to…
-
👪 Parse Full Name
Summary The functions contained within this package are all designed to help you in dealing with names data; specifically when a full name is contained within a single field. The functions, detailed below, help with resequencing names when commas are present (e.g. “Danny Roden” vs “Roden, Danny”) and then with parsing out…
-
Offensive Words 🤬
Summary The functions contained within this package all relate to flagging and dealing with data containing offensive language. All of these functions use a domain of offensive words (contained within the .dmxd package) which contains a list of known offensive terms. Note: this data is open source and originates from:…
-
Mask Out Email/Phone 🛡️
Summary This package contains 2x functions to help with anonymising certain input values, whilst leaving an output that can still be used for non-sensitive analysis. Mask Out Email This function masks characters in the sensitive part of an input email address with 'X' characters, leaving the domain untouched (e.g.…
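To illustrate the email masking idea only, here is a short Python sketch that replaces every character before the final `@` with 'X' and leaves the domain untouched. The packaged function may mask differently (its example is truncated above), so the exact masking rule and the function name are assumptions.

```python
def mask_email(email: str) -> str:
    """Mask the local part of an email address with 'X', keeping the domain."""
    local, separator, domain = email.rpartition("@")
    if not separator:
        # No '@' present, so there is no identifiable local part to mask.
        return email
    return "X" * len(local) + separator + domain


print(mask_email("someone@example.com"))  # XXXXXXX@example.com
```

Keeping the domain intact means the masked values can still be grouped or profiled by email provider.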
-
Proper Case Surname 📛
Summary This package contains 2x functions which help with contact data: Proper Case Surname and Validate Surname Casing. Proper Case Surname As the name suggests, this function produces a proper-cased Surname for a given input, taking into account some key exceptions including: Irish & Scottish names (e.g. O'Neil and…
-
Job Title Match Key 👨‍💼👩‍💼
Job Title Match Key The function in this post has been designed to help illustrate an approach (and act as a template for you to build on) to help handle inconsistencies with Job Titles found in B2B databases. In short, the function generates a key that can be used to group job titles together (despite presentation…
-
Invalid Character for Names ☹️
Summary This package contains 2x functions: Contains Only Valid Characters for Names & Contains Invalid Character for Names Contains Only Valid Characters for Names This function finds records where the field contains only characters which are valid for names. Records which contain digits, commas and other…
-
Convert Boolean ✅❌
Convert Boolean [Category: Conversion] This function converts binary or Boolean values (i.e. true/false, pass/fail, 1/0) to a graphical emoji icon (✅ or ❌) to aid visual presentation in the UI. See below for a preview of the function definition along with some outputs for different input values: Using Data Studio's bulk…
-
Contains Non-Latin Characters 🈯
Contains Non-Latin Characters [Validation] This function identifies the presence of any characters not in the 'basic Latin' Unicode block. This may be useful in verifying assumptions about data expected to be in Latin character sets only (particularly when downstream processes require it to be).…
-
Reverse String ⏪
Reverse String If you find yourself needing to reverse a string (e.g. for bespoke building of anonymisation logic, or so that you can sort on suffix) then this function may be of use to you. The function simply reverses the input (as in displays it in backwards sequence) character-for-character (e.g. "Danny R" becomes "R…
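For comparison, the same character-for-character reversal is a one-liner in Python using slice notation. The function name is hypothetical, and note that this reverses code points, so combined characters such as some emoji or accented letters may not survive intact.

```python
def reverse_string(value: str) -> str:
    """Return the input with its characters in reverse order."""
    return value[::-1]  # a step of -1 walks the string backwards


print(reverse_string("Danny R"))  # R ynnaD
```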
-
PCI Detection 💳 (Payment Card Information)
Summary The following package is used to detect payment card information (PCI) and whether it is present or not. Contained within this .dmx package are 2x functions: Contains PCI and Does Not Contain PCI see below for a brief summary of each: Contains PCI [Category: Validation] Checks that the input does not contain either…
-
SIC Conversion 🏷️
Summary The following package is used in processing Standard Industry Classification (SIC) codes, and was designed to help with post-processing the 2007 variant of the SIC code that is output when performing UK Business data enrichment using Experian cleansing. Contained within this .dmx package are 2x functions: Convert…
-
Challenge 3 - It's a Date!
This latest challenge is all about dates. The sample dataset highlights a common challenge many of us are familiar with, which is around non-standard date formats (most often seen when working with data exports from mainframe systems or proprietary fixed width file formats). Have you ever tried running analytics on dates…
-
Connecting to a SAP HANA database using Data Studio
From the Data Studio v1.6.1 release onwards, we've made it easier to connect to a SAP HANA database directly from Data Studio. With the SAP HANA JDBC connection you can load in data from HANA to Data Studio, or export back. First, you'll need to locate the SAP HANA driver. The driver (ngdbc.jar) is installed as part of the…
-
SAML anyone?
Hi everyone. Our engineering team is currently working on a feature for Single sign on enterprise authentication with SAML. There are some technical questions that they are seeking some input on in order for us to best address our client's needs. Which Identity Provider (IdP) do you use for SAML? Does the Identity Provider…
-
Sharing of useful functions to support contact data validation
Hi all, I've found myself using a few functions on a regular basis so thought I'd share them on here in case they are of benefit to anyone else. The attached .dmx file contains the following functions: Free from PCI info (checks an input does not contain 16-digit numbers either together or separated by hyphens or spaces…
-
What is the best way to check for the presence of a number value in a field?
We often have a requirement from clients to check for numbers in a name field or for name in a phone number field. What is the easiest way to do this?
-
Challenge 2 - Title Casing Surnames
Does it bother you when you get mail through the letter box and your name is not presented with ‘standard’ casing? I did some work with a charity a few years ago and this was something they got a surprisingly high number of calls regarding. Long story short, they had a free-text form that their call center staff used to…
-
Attention all V1 & V2 users...
Today we have launched a new sub category, Migrating from V1 to V2, which contains posts to aid your transition to V2, as well as access to a migration tool. We will be adding more posts over the coming days and weeks so stay tuned for tips and tricks to aid your migration. Please note, you must be signed in to the…
-
✏️ Challenge Index 🔍
Each month we're posting a new challenge for you to have a go at to test your data skills. Challenges will vary in difficulty, data domain and task category (e.g. data cleansing, matching, validation, preparation etc.). This page has been set up to act as a single location from which you can easily find and search through…
-
Challenge 1 - Gender Title Mismatch Detection
This first challenge is concerned with identifying records that are displaying inconsistent information. The dataset (attached) for this challenge shows title, forename, surname and gender information and your challenge is to identify records where titles and gender fields provide conflicting information (e.g. Title is…