Unable to export large record sets from Aperture to Databricks

Hi Team,

We are attempting to export 10,000 records from Aperture Data Studio to Databricks via a JDBC external system, but the export is taking 6 to 8 hours to complete.

Experian Support suggested exporting the data from Aperture to a Databricks storage container in CSV format. To do this, we would need a service principal to connect from Aperture to the Azure storage container. However, the service principal capability is not yet available in the current version of Aperture.

Could you please advise if there are any alternative options for connecting Aperture Data Studio to the Databricks storage container?

Regards

Uma

Comments

  • Hayden, Ian (Experian Super Contributor)

    Hi Uma, there are a few things you can try. The most obvious is increasing the MaxBatchSize connection property to a larger value. If the export is still slow, it is worth enabling extended logging to see where the time is being spent. Details of all of this are on the official documentation page here:

    Supported connection properties | Databricks on AWS
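
    As a rough sketch, connection properties such as MaxBatchSize are typically appended to the Databricks JDBC URL as semicolon-separated key/value pairs. The hostname, HttpPath, and property values below are placeholders, and the exact property names and valid ranges should be confirmed against the documentation linked above:

    ```
    jdbc:databricks://<workspace-host>:443;HttpPath=<http-path>;MaxBatchSize=100000;LogLevel=6;LogPath=/tmp/databricks-jdbc
    ```

    Here MaxBatchSize controls how many rows are fetched or written per batch, and LogLevel/LogPath (if supported by your driver version) enable the extended driver logging mentioned above so you can see where the time is going.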