Aperture REST API calls as Datasets within Aperture

Dom Gittins
Dom Gittins Member
edited December 2023 in Tips and tricks

Data is available from Aperture via the REST API that is not available in Aperture. Full details of Spaces, User Groups, Users, Jobs, Datasets etc.

We would like to use Aperture to manage Aperture.

For example: An automated workflow which checks for long running jobs and notifies us of issues; a regular review of user activity which allows us to remove access from inactive users, producing data quality stats which are already enriched with Space metadata and which automatically update when a new Space is created.

We can see that there is already a JDBC driver which ought to be able to do this (Autonomous REST 6.0) but the configuration is highly complex. Ideally we would have a config file that sets up all GET calls from the API and makes them available as datasets which we can then use in our Data Quality Management space (and activities).

Has anyone configured the Autonomous REST driver to access the Aperture API ?

If not, could Experian provide a documented example setup including a config file we can complete with the details of our own server ?

Tagged:

Best Answer

«1

Answers

  • Henry Simms
    Henry Simms Administrator

    Some ideas for things to could do with this Data Studio metadata.

    1. Monitoring and email alerting on unusual login activity via the Events table
    2. Create dashboards showing all users' admin access, as a way of checking that permissions aren't too broad
    3. Analysis of Job execution times - checking and alerting if certain jobs are running in parallel.
    4. Finding where your large Datasets are (based on number of rows & columns) that may be hogging disk space
    5. Add in the Metrics operation endpoint and create a dashboard to monitor things like CPU usage, or page responsiveness
  • Danny Roden
    Danny Roden Administrator

    Great question @Dom Gittins (and nice response @Henry Simms).

    I've just given this a go myself and it worked a treat.

    Just as an FYI for anyone else giving this a go I found that the file you attached downloaded as a .txt (so it needed renaming to get rid of this extension).

    I'm currently using this approach to help @Sharat Kutty generate a list of all datasets within a 'master data' space and expose this as a list for other users to see what 'Enterprise Data' is available (i.e. the catalogue of data that they can request access to, via a shared View). What's nice about this is that you can pull through all of the metadata e.g. source type, column name, datatype, dataset rowcount, summary/description fields and more. Nice for encouraging consistent use (in line with any defined best practises) of the software

  • Henry Simms
    Henry Simms Administrator
    edited January 2023

    A couple of people have hit a problem where, after setting up the REST API connection, loaded tables from the new AutoREST system have 0 rows. That's accompanied by an error in the log file along the lines of:

    java.sql.SQLException: [ApertureDataStudio][AutoREST JDBC Driver][AutoREST]I/O Error com.experian.aperturedatastudio.autorest.util.aj: SSL handshake failed: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target

    This is a site certificate error, and essentially indicates that Data Studio doesn't trust the API calls it's making to itself.

    With this set-up, Data Studio is acting like a client to its own API, but doesn't have its own certificate in its truststore - only it's key store. That's quite likely to happen if you're using a certificate from a private CA.

    The fix is to add the public certificate from Data Studio into the cacerts truststore it uses. Steps:

    1.     Get the public certificate. You can grab this from the browser and download to a .cer file or similar

    2.     Run the following command from the command line to import them into the Aperture Data Studio truststore:

    "<JRE path>\bin\keytool" -import -trustcacerts -alias <alias> -file "<Your own Certificate>.crt" -keystore "<database path>\certificates\cacerts"

    Where:

    • <JRE path> is C:\Program Files\Experian\Aperture Data Studio 2.9.6\java64\jre, where you'll need to modify for your own version number and drive letter, if not C:\
    • <alias> is anything you want
    • <Your own Certificate> the full path to your public key file
    • <database path> = D:\aperturedatastudio, again you may be using a different drive letter


  • Josh Boxer
    Josh Boxer Administrator

    Love this! For anyone downloading Jobs API, the start and end times are epoch time, which is "the number of seconds that have elapsed since January 1, 1970" whilst fairly standard for API, not useful for a human, so this Function can be used to transform:


  • Danny Roden
    Danny Roden Administrator

    I've also just added a new reusable function to the Community Function library for this :) https://community.experianaperture.io/discussion/971/convert-epoch-date-time-to-standard-date-time/

  • Henry Simms
    Henry Simms Administrator
    edited May 2023

    Update that I discovered recently: If you don't want to expose the API key used by the External System in the Data Studio UI, you can add it to the .rest config file instead. So rather than:

    you can add the following lines to the top of your .rest file:

    "#options": {
    "authenticationmethod": {
    "choices": "HTTPHeader",
    "default": "HTTPHeader"
    }
    },

    and then for each table add:

    "#headers": {
    "Authorization": "Default SzSkkIW0t9VJMLpEqGd..<etc>",
    "Content-Type": "application/json"
    },

    The file should look something like this:

    Any headers can be passed in this way

  • @Henry Simms could we use this process to connect with an external system such as Zendesk? the idea is to download the data from Zendesk and upload it within a Aperture Space.

    Many thanks in advance for your support.

  • Henry Simms
    Henry Simms Administrator

    Hi @Marco_13112001 - Yes, it should certainly be possible to use the Autonomous REST connector to pull data from the Zendesk REST API. To get started I would recommend:

    1. Using Postman or a similar tool, ensure that you can use the API directly to do the things you need, and have the correct auth credentials etc
    2. Run the driver's Configuration Manager from the commandline to begin configuring the connection info you'll need when using the AutoREST driver in Data Studio
      1. Run "C:\Program Files\Experian\Aperture Data Studio 2.11.3\java64\bin\java.exe" -jar DSautorest.jar. (you will likely need to update your path to the java.exe used by your version of Data Studio)
      2. By default, running the jar will pop up the "Configuration Manager" (jdbc.html) page for the driver. To create the custom model (mappings, auth etc) for your API using the "Composer"(arc.html) page to generate a .rest file, as well as the connection string.
    3. Only when you have access to the normalised tables you need in the AutoREST composer, move the configuration to Aperture Data Studio

  • @Henry Simms Many thanks for your help. I'm using Postman as suggested to develop and test the API (I need to add that I'm still learning about the whole process), however even tough it is returning a 200 response it also brings an warning saying that it is "unable to find local certificate".

    The post Request has the following structure - {{baseurl}}//api/v2/datasets/{space_id}/create?token={Aperture API key}. The Aperture key was setup to enable the creation of a dataset.

    Am I using the wrong construct or it is an authorisation issue?

    Many thanks thanks in advance.

  • Henry Simms
    Henry Simms Administrator

    Hi @Marco_13112001 , looks like there's a few things going on here.

    1. I can see it looks like you're looking at the Data Studio REST API. You won't need this to bring data from your ZenDesk API into Data Studio, because once you've got the External System set up (connected to Zendesk, with the driver's .rest file configured to normalise the responses you need into tables), you'll just be able to bring in the data like you do with any other Dataset, via the UI. You'll then be able to set up auto-refresh, also as with any other External System source dataset:
    2. If you do want to use the Data Studio REST API, there's a few changes to make to the way you're calling it in Postman:
      1. Move the authorization to the Header, and make sure you include the environment label ("Default" is the default) before the API key:
      2. If your Data Studio instance uses an SSL certificate from a private CA, or uses a self-signed certificate, you may need to add your certificate to Postman.

  • Henry Simms
    Henry Simms Administrator

    We recently needed to load data from the Exportable Objects endpoint of the Aperture API, so we could analyse dependencies between objects in bulk.

    This is a per-space call (the space id is passed in the call), and the below snippet shows what you would need to add to your .rest file to bring back exportable objects from Space 6:

    "exportableobjectsspace6": {
    "#path": [
    "/api/1/exportableobjects/6 /objectLists"
    ],
    "objectType": "VarChar(64),#key",
    "objects[]": {
    "id": "Integer",
    "name": "VarChar(76)",
    "summary": "VarChar(241)",
    "status": "VarChar(13)",
    "dependsOn<exportableobjectsspace6_dependsOn>[]": {
    "objectType": "VarChar(64)",
    "id": "Integer",
    "name": "VarChar(64)"
    },
    "forcedDependencies<exportableobjectsspace6_forcedDependencies>[]": {
    "objectType": "VarChar(21)",
    "id": "Integer",
    "name": "VarChar(40)"
    }
    }
    }

    This returns 3 related tables:

    The first table contains the objects in the space, the second and third tables can be joined on (by OBJECTTYPE and POSITION) to view dependencies and forced dependencies

  • @Henry Simms I have used the attached .RESt file above and its working for me. Many Thanks for that. But it does not have all the available APIs. I was looking for the below ones as well, to check the Performance of our servers

    I tried to get the through progress DataDirect using the Rest you shared as the existing model, but getting some Auth issue there.

  • Henry Simms
    Henry Simms Administrator

    Hi @Mahulima , I've updated the datastudio_api.rest file in my original post to include tables for:

    • metrics. This will generate a single 486-column table containing all available metrics from the Aperture API:
    • metrics/disc.metric. This specifically extracts the disk metric information from the Aperture API, and maps as multiple relational tables which would need to be joined:

    Different metric names (call metrics/list for the list) will have different mappings.

  • @Henry Simms Thank you so much! I tried and its working perfectly :)

  • Mahulima
    Mahulima Member
    edited December 2023

    I am facing one issue with Sessions API. When I am getting the response of the session it is showing me the start date as 28th Nov for me and for some regular active users and end date is null. I thought when we log off from the Aperture the session will be ended, but its not happening. Idea was to get active users for past 24 hrs. which I am not getting with Users API as well, as there it is only showing Last login as a date not as timestamp. But there getting the correct Information. Can you please help here. @Henry Simms

  • Henry Simms
    Henry Simms Administrator

    Hi @Mahulima - I can see the same behaviour, but can confirm that the problem isn't with the API, as you also see null enddate values in the repository database itself. I'll try to find out more about why some session records do not store end dates

  • Hi @Henry Simms , Is there a way we can get the details of the View as well, Currently from the attached rest file we are getting the dataset details but it only contains datasets, view details are missing

  • Henry Simms
    Henry Simms Administrator
    edited December 2023

    HI @Mahulima , the problem really is that the REST API doesn't provide much information about Views at the moment. While for Datasets, you can make a call to get lots of information about all Datasets in a space (e.g. source, refresh data, row count), for Views you can only really get sharing information for a specific View (based on an ID), or carry out operations for automatically sharing or including views.

    If you're looking to list Views in a Space and gather metadata on those views, I think a new REST API endpoint would be needed. I would suggest putting an idea on the Ideas Board

  • @Henry Simms Thanks for your prompt response. I am trying to get the view information along with which all spaces the view has been shared and also when the last time the view has refreshed.