-
JDBC Performance issues can simply be about security
In Data Studio you have a number of different ways of accessing data: file-based methods; data-connection-based methods (JDBC or a special connector); API or programmatic access; and leveraging snapshot data. JDBC is a high-performance way of getting data into Data Studio fast! Getting JDBC up and running is straightforward and…
-
Data Wrangling - is it so bad?
An interesting perspective from Pete Aven of MarkLogic popped up in my feed this week, written on Medium.com and entitled "Data Wrangling is Bad". Aven describes how potentially we're all Data Wranglers, and argues that this is not a good thing and should not be embraced or accepted. In reality though, do we have a choice?
-
Why might my Find Duplicates results look different?
If I use the Find Duplicates step on its own, in some instances I get more clusters of records (clusterIDs) than if I use Data Studio with the Address matching step. I attach an example that illustrates this using the test data delivered with the application. Why would that be?
-
Do businesses run on premium data? New study assesses variables in data quality tools
Lisa Ehrlinger from Johannes Kepler Universität Linz in Linz, Austria, and her team have identified 667 data quality tools on the market and narrowed that number down to 13 for detailed testing and analysis, based on their domain independence, non-specificity, and availability for free or on a trial basis. While the…