JDBC Performance issues can simply be about security

Clinton Jones
Clinton Jones Experian Elite
edited December 2023 in General

In Data Studio you have a number of different ways of accessing data.

  • File Based methods
  • Data connection based methods (JDBC or special connector)
  • API or programmatically
  • Leveraging snapshot data

JDBC is a high performance way of getting data into Data Studio fast!

Getting JDBC up and running is straightforward and generaly requires very little in the way of additional configuration except in the case of custom or 3rd party JDBC drivers which you may choose to explicitly add.

If your JDBC loads are taking a long time, experience tells us that this could stem from a number of different reasons.

A common reason would be that your Data Studio installation is simply under-powered. You may be using conventional disk or SAN storage and you might not have enough resources on the machine. Check out the recommendations that we have for system requirements if you suspect that this is the case

A second reason can be that the distance (network hops and latency) between your Aperture Data Studio instance and the source of the data, is many hops or the connection is has high latency. There are a number of ways you can test that latency, including running a traceroute and doing ping tests with a payload. Less hops and less latency is better.

A third reason could be that the system you are connecting to, has some problem. There are many reasons that potentially contribute to this. In combination, these include (but are not limited to):

  • Database structures (tables vs view), table indexes and database statistics
  • Size of the database and concomitant number of tables, views etc
  • Number of active users and RDBMS workload
  • Security and permissions

Resolving issues on your database are tricky because if it is not your database and you are not the database administrator then you have to depend on others to help you identify and resolve performance bottlenecks.

A problem that @Henry Simms and @zydrond and I encountered this past week related to security and permissions actually having a negative impact on performance of a SAP HANA database. On face value I would typically say that good security on a database would actually prevent you from connecting to or accessing particular tables. The reality though, is that database and in particular, application database permissions can be complex. This is one of the reasons why application vendors like Oracle and SAP actually don't like you accessing tables directly. This potentially compromises the permissions structures that applications have in place.

A clue in this instance, was that the JDBC connection did not allow the user to see the tables and schema of a particular table that we were interested in - MARA - however in the JDBC driver configuration you do have the ability to specify tables or table patterns and in this case, because we knew this was the table we were interested in, we forced the RDBMS to yield that table and its contents.

For some strange reason, that particular credential that we were using gave us some access and in fact enough access to start pulling down records, albeit slowly. A switch to a more appropriate credential with better/greater permissions ultimately resolved the problem but what this tells us is that apart from the usual JDBC traces you might aso want to consider security traces in order to identify the root cause on performance bottlenecks with JDBC connections.

If you have a similar experience, please do share it!

Tagged: