Updating Data via Aperture vs SQL
Hi,
Was hoping fellow customers or Experian staff could share their experiences with this one. I want to update data in the DB directly using Aperture (through the Export step - JDBC), but our DBA wants updates done via SQL scripting (he's hesitant to use Aperture). I am pushing for Aperture as it is a controlled, rules-based environment. Any thoughts / input from this group?
Thanks
Answers
-
Hi James, where does your DBA see the main benefit of SQL (performance, control, flexibility, something else), and what are their objections to Data Studio?
-
I think it's control: knowing exactly what is happening. I suggested including the log showing the updates in a test environment to get approval. I was just hoping there might be an anecdote or comparison explaining why using Aperture is low-risk and a preferred choice.
-
For me, Data Studio with JDBC is the preferred choice for most SQL-based database updates, for a few reasons:
- Ability to validate data against the target schema in a workflow before attempting any DB updates. I can use a Validate step to check data types, nulls, uniqueness, lengths, etc. to construct a set of data that should be able to enter my database without errors. I can even use the Suggest Validation Rules feature against my existing data to speed up rule creation, or apply an existing ruleset.
- Control over how data is updated and how update errors are handled. Depending on the use case I use a variety of Export step options, including "Continue Workflow execution on JDBC exports error" with "Show failing rows" when I want to specifically capture and handle any insert or update failures. Using events and notifications I can get Data Studio to proactively alert on errored export rows and trigger downstream processes. I can also usually get detailed logging from the drivers, through simple config, to show exactly what calls are being made to the database. (A rough sketch of the equivalent hand-rolled JDBC logic follows this list.)
- Features like authentication and data security. The JDBC driver will (in most cases) have built-in and well-documented settings to easily configure the authentication mechanism and in-transit data encryption your organization requires. This means I don't have to worry about handling the movement of data to and from a script execution layer. Data Studio also handles credential storage and access on a per-user or per-group basis.
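To make the error-handling point concrete, here is a minimal sketch of what a bespoke script approach ends up re-implementing by hand to get "continue on error and capture failing rows" behaviour. The host, database, table, columns and credentials are all invented for illustration; encrypt=true is the Microsoft SQL Server driver's TLS setting, and other drivers expose equivalents under different property names.

```java
import java.sql.*;
import java.util.*;

public class RowLevelUpdate {
    public static void main(String[] args) throws SQLException {
        // Hypothetical target: a CUSTOMER table with (ID, EMAIL) columns.
        // encrypt=true is the MS SQL Server driver's TLS-in-transit setting;
        // other drivers expose equivalents under different property names.
        String url = "jdbc:sqlserver://dbhost:1433;databaseName=CRM;encrypt=true";

        List<String[]> rows = List.of(
                new String[]{"1001", "alice@example.com"},
                new String[]{"1002", "bob@example.com"});
        List<String[]> failingRows = new ArrayList<>();

        // Credentials are inlined here only for brevity; Data Studio would
        // hold these in its own credential store.
        try (Connection conn = DriverManager.getConnection(url, "svc_user", "secret");
             PreparedStatement ps = conn.prepareStatement(
                     "UPDATE CUSTOMER SET EMAIL = ? WHERE ID = ?")) {
            for (String[] row : rows) {
                // Point 1 equivalent: pre-validate against known constraints
                // (here just not-null and max length) before touching the DB.
                if (row[1] == null || row[1].length() > 255) {
                    failingRows.add(row);
                    continue;
                }
                try {
                    ps.setString(1, row[1]);
                    ps.setInt(2, Integer.parseInt(row[0]));
                    ps.executeUpdate();        // row-at-a-time so one failure
                } catch (SQLException e) {     // doesn't abort the whole run
                    failingRows.add(row);      // the "Show failing rows" idea
                }
            }
        }
        System.out.println("Rows that failed: " + failingRows.size());
    }
}
```

In Data Studio the equivalent behaviour is a couple of options on the Export step, which is a big part of the control argument for me.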
A scenario I have encountered where JDBC has proved not to be the best option is very large volumes, where performance / efficiency is paramount. Data Studio / JDBC inevitably introduces some overhead and won't always do things in the most efficient way compared to a bespoke script. Many drivers do provide flexibility around, for example, use of bulk load protocols, but that support isn't always fully implemented.
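As a rough illustration of the bulk-load point: the Microsoft SQL Server JDBC driver documents a useBulkCopyForBatchInsert connection property that routes batched inserts through the bulk copy API, and PostgreSQL's driver has reWriteBatchedInserts. A minimal, hypothetical batch-insert sketch (host, database, table and credentials invented):

```java
import java.sql.*;

public class BulkBatchInsert {
    public static void main(String[] args) throws SQLException {
        // useBulkCopyForBatchInsert is a documented MS SQL Server driver
        // property that routes batched inserts through the bulk copy API;
        // PostgreSQL's driver has reWriteBatchedInserts. Host, database,
        // table and credentials below are all invented for illustration.
        String url = "jdbc:sqlserver://dbhost:1433;databaseName=CRM;"
                + "encrypt=true;useBulkCopyForBatchInsert=true";

        try (Connection conn = DriverManager.getConnection(url, "svc_user", "secret");
             PreparedStatement ps = conn.prepareStatement(
                     "INSERT INTO CUSTOMER_STAGE (ID, EMAIL) VALUES (?, ?)")) {
            conn.setAutoCommit(false);
            for (int i = 1; i <= 100_000; i++) {
                ps.setInt(1, i);
                ps.setString(2, "user" + i + "@example.com");
                ps.addBatch();
                if (i % 5_000 == 0) {
                    ps.executeBatch();  // flush periodically to bound memory
                }
            }
            ps.executeBatch();          // flush any remainder
            conn.commit();
        }
    }
}
```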
-
@Henry Simms Thanks, those are all really great points. For #2 I'm doing something similar - sending the 'Passing' and 'Failing' rows to an output Excel file for audit purposes. It's a shame the Export (JDBC) step doesn't allow export of its settings; I could have included those too, showing DBMS, table name, mode, etc.
Would still like to hear from fellow Aperture customers about their approach to data updates - pros/cons and any challenges they've faced.