Handling data from Experian Aperture Data Studio jobs

Shreya (Member)
edited December 2023 in General

We are getting an error when executing Experian scheduled jobs:

"There is not enough space on the disk"

Experian Aperture Data Studio data is installed on the E: drive (511 GB), of which only 19 GB remains. Most of the space is used under E:\ApertureDataStudio\data\resource, specifically the E:\ApertureDataStudio\data\resource\index (227 GB) and E:\ApertureDataStudio\data\resource\source (119 GB) folders.
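For reference, this is a minimal Python sketch (paths taken from above; the script itself is just illustrative) to confirm where the space is going:

    import os

    BASE = r"E:\ApertureDataStudio\data\resource"

    def folder_size_gb(path):
        """Walk a directory tree and sum file sizes, in GB."""
        total = 0
        for root, _dirs, files in os.walk(path):
            for name in files:
                try:
                    total += os.path.getsize(os.path.join(root, name))
                except OSError:
                    pass  # file may be locked or deleted mid-walk
        return total / 1024**3

    for sub in ("index", "source"):
        path = os.path.join(BASE, sub)
        print(f"{path}: {folder_size_gb(path):.1f} GB")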

Could someone advise on how to manage the data in these folders, and on storing Experian data across drives in general?
The Experian software versions in use are:
Experian Aperture Data Studio - 2.10.10.157, installed on an Azure VM
Find Duplicates (remote) - 3.8.15, installed on a separate Azure VM

Best Answer

  • Ian Hayden (Experian Super Contributor)
    Answer ✓

    Hi Shreya,

    The resource/source folder contains the data you have loaded into Data Studio, i.e. everything you see under Datasets in the UI. If space is a continuing issue, you can turn on compression (under Settings→Performance→Compression) for a small trade-off in execution time; this applies to new datasets only. Large unused datasets or old batches can also be removed to free up space.

    Everything under resource/index consists of temporary files used when executing workflows or other operations under Explore. Each step has different requirements for what it needs to store while executing, and these temporary files are retained in case you need to drill into a workflow step again (making it much faster the second time). Again, if space is an issue you can change how long these temporary files are retained (in Settings→Performance→Index Expiry Time), although an occasional Data Studio restart is required for this to take effect.
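    If it helps to see how much of resource/index has aged past a given retention window, here is a minimal read-only Python sketch. It assumes file modification time approximates last use, and the 7-day threshold is illustrative rather than a Data Studio default; since Data Studio expires these files itself, avoid deleting them by hand while the service is running.

        import os
        import time

        INDEX_DIR = r"E:\ApertureDataStudio\data\resource\index"  # path from the question
        MAX_AGE_DAYS = 7  # illustrative threshold, not a Data Studio default

        cutoff = time.time() - MAX_AGE_DAYS * 86400
        stale_bytes = 0
        for root, _dirs, files in os.walk(INDEX_DIR):
            for name in files:
                path = os.path.join(root, name)
                try:
                    # Count files whose last modification predates the cutoff.
                    if os.path.getmtime(path) < cutoff:
                        stale_bytes += os.path.getsize(path)
                except OSError:
                    pass  # files can disappear while scanning
        print(f"Older than {MAX_AGE_DAYS} days: {stale_bytes / 1024**3:.1f} GB")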

    It is highly recommended that you manage your disk space proactively to prevent it from running out. Because usage depends on the volumes of data you load and the number, type and frequency of workflow executions, it is difficult to give more specific guidelines than that.
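    As a starting point, something as simple as the following Python sketch, run on a schedule, can warn you before the drive fills up (the 10% threshold is an illustrative choice, not an Experian guideline):

        import shutil

        # Drive hosting Data Studio, per the question above.
        usage = shutil.disk_usage("E:\\")
        free_pct = usage.free / usage.total * 100
        print(f"E: drive: {usage.free / 1024**3:.0f} GB free ({free_pct:.0f}%)")
        if free_pct < 10:
            print("WARNING: low disk space; review datasets and index expiry settings.")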

    In any case, 19 GB does seem on the low side. If the steps above do not give you back much storage, please get in touch and we can advise further.

    Regards,

    Ian
