Error attempting to delete match store
Hi,
In my project, we are using Experian Aperture Data Studio (2.10.10) to connect to a remote Find Duplicates match store (3.8.15) over HTTP.
The Find Duplicates step is configured with 'Clear and re-establish store' checked, as part of our requirements. The majority of the time the process runs end to end successfully; however, we sometimes get the error below in the Find Duplicates step, where deleting the match store fails:
2023-05-26 03:47:44,947 ERROR c.e.d.m.MatchApiImpl [workpool-server-fixmem-executor-closer-862] Error attempting to delete match store:
com.experian.match.api.client.ApiException: {"matchStoreId":"Dupstore_Deduplication_RemoteVM","matchStoreLocation":"E:\ApertureDataStudio\data\experianmatch","state":"FAILED","progress":100.0,"message":"Find Duplicates step failed. java.util.concurrent.TimeoutException: Futures timed out after [120 seconds]","createTime":"2023-05-25T03:12:05.465109500","startTime":"2023-05-25T03:13:33.728723900","finishTime":"2023-05-26T03:47:45.139292"}
As a workaround, we currently restart the remote Find Duplicates service and manually clear the match store from Data Studio to confirm that clearing the duplicate store works as expected. This process, albeit manual, works for now; the restart is sketched below.
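For reference, this is roughly how we restart the service on the remote server (a sketch; the service name below is a placeholder, so check services.msc for the actual name on your install):
Restart-Service -Name "FindDuplicates"   # placeholder service name; the real name may differ per install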
Could someone let me know if there is a permanent fix for this issue, or how to resolve it so that manual intervention won't be required in the future?
Answers
-
Hi Shreya
Our dev team think this error is unexpected. If you run into it again, could you please share your Find Duplicates log file (and possibly your Data Studio log file as well)? The easiest way to do this is to contact our Support team, who can also help you or your administrator locate these files if needed.
-
We have upgraded Experian Aperture Data Studio (2.10 to 2.13.6) and the remote Find Duplicates server (3.8.15 to 3.10.2), but we still get recurring errors while deleting the match store as a pre-step before running the Find Duplicates step.
When checking the duplicate store, we get an error saying the duplicate store could not be found.
To resolve this, we currently restart the remote Find Duplicates service, manually refresh the match store from Data Studio to confirm that connectivity to the duplicate store is working as expected, and rerun the workflow, which works.
Could someone let me know if there is a permanent fix for this issue, or how to resolve it so that manual intervention won't be required in the future? Please find the error screenshot attached for details.
-
Hi Shreya, we will need to look into this, as we haven't heard of this being a common issue. Could you please contact Support first so they can try to reproduce the issue and gather logs, as we will need more information? You can find regional support contacts via the same link that Josh posted above. Thanks.
-
Hi team,
I am running into the same issue. I have tried re-creating the duplicate store; the job runs fine a few times and then fails, seemingly unable to locate the duplicate store. Restarting the duplicate server service fixes it, but we don't have a fix from the product team yet. We are still working with Support on it.
Our duplicate stores sit on the D: drive of the Find Duplicates server; that's the only non-standard thing I can think of.
-
@SCH what versions of Data Studio and Find Duplicates are you using?
Our investigation found that this occurs with stores that had problems during creation, either from running low on resources or from a job being stopped. Firstly, though, you might want to upgrade to the latest versions, and also keep an eye on resources running low (a quick spot-check is sketched below).
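For example, assuming a Windows server with the store on the D: drive, something like this gives a quick view of free disk space and available memory (a rough sketch, not an official health check):
Get-PSDrive D | Select-Object Used,Free        # free space on the drive holding the match store
Get-Counter '\Memory\Available MBytes'         # available physical memory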
-
We are on the latest versions:
ADQ : 2.13.8.173
FDS : 3.10.5
-
Thanks for confirming; let me see if I can find the Support case you raised.
Some info on system resourcing/requirements that might be helpful:
-
@Josh Boxer: I have confirmed that our separate instance is installed with 16 vCPUs and 128 GB of RAM. I checked your link and want to add the JVM RAM parameters; could you confirm how to do it?
This is the existing entry in the find duplicates.ini file:
Virtual Machine Parameters=-Dlogging.config=log4j2.xml -Dspring.config.name=find_duplicates -Dmatch.database.path.windows="D:\ApertureDataStudio\data\experianmatch" -Dserver.port="8080" -Dmatch.maximum.cluster.size="500" -Dmatch.standardize.url="http://localhost:5000"
Would this need to change to the following, with -Xms (initial heap size) and -Xmx (maximum heap size) simply appended at the end of the existing line?
Virtual Machine Parameters=-Dlogging.config=log4j2.xml -Dspring.config.name=find_duplicates -Dmatch.database.path.windows="D:\ApertureDataStudio\data\experianmatch" -Dserver.port="8080" -Dmatch.maximum.cluster.size="500" -Dmatch.standardize.url="http://localhost:5000" -Xms32g -Xmx64g
-
Hi,
Below are the VM configs that we are using and the JVM parameters (the highlighted JVM properties were added after earlier conversations with the product support team).
Find Duplicates server (remote)
memory config : 32GB RAM and 600GB HDD
Virtual Machine Parameters=-Dlogging.config=log4j2.xml -Dspring.config.name=find_duplicates -Dmatch.database.path.windows="E:\ApertureDataStudio\data\experianmatch" -Dserver.port="8443" -Dmatch.maximum.cluster.size="500" -Dmatch.standardize.url="http://localhost:5000" -Xms8g -Xmx20g
Experian Aperture Data Studio server
memory config : 32GB RAM and 600GB HDD
Virtual Machine Parameters= -Djava.locale.providers=CLDR,COMPAT -Djavax.net.ssl.trustStore="E:\ApertureDataStudio\certificates\cacerts" -XX:+UseG1GC -XX:+UseStringDeduplication -XX:+HeapDumpOnOutOfMemoryError -Xmx20G
Please note that on both servers we are not running any applications other than the Experian ones. Please let me know if any config change is required.
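For reference, our understanding of what these flags do (please correct us if this is wrong):
-Xms / -Xmx : initial and maximum JVM heap size
-Djava.locale.providers=CLDR,COMPAT : locale data lookup order for the JVM
-Djavax.net.ssl.trustStore : path to the trust store used for TLS connections
-XX:+UseG1GC : selects the G1 garbage collector
-XX:+UseStringDeduplication : lets G1 deduplicate identical strings on the heap
-XX:+HeapDumpOnOutOfMemoryError : writes a heap dump file if the JVM runs out of memory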