AWS S3 Dropzone

Mirjam Schuke · September 2024

A Data Studio user had the following question that might be relevant for the community.

Is someone able to elaborate on how an Aperture dropzone on an AWS S3 bucket works?

I have a dataset set up as a dropzone, and it correctly picks up new files with a different name when they are added to the S3 bucket. However, if I replace a file in S3 with a new version of it, the dropzone doesn't detect the new file. If I delete the file from S3 and re-upload a new version of it, the dropzone doesn't detect that either.

If I delete the batch in the dataset and then re-upload a new version of the file, it still doesn't detect it.

Does the dropzone literally just look for filenames it hasn't seen before, so that new versions of a file won't be pulled through when they are uploaded? If that's correct, it seems like a bit of a limitation, and I feel it should be using the timestamp metadata on the file in S3 to detect a new version.

Thanks

Mirjam Schuke · September 2024

Tick the "Enable external system dropzone" box for that dataset (Edit Details under Actions) so that new versions of the file are monitored, and include a file pattern that starts with the generic part of the filename so that new versions of that file are picked up.

The documentation highlights: "the processed file is tracked in the server based on the file name and last modified date, to ensure it is not loaded again unless modified." https://docs.experianaperture.io/data-quality/aperture-data-studio-v2/get-started/configure-external-systems/#cloud-storage-and-sftp
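To make the quoted behaviour concrete, here is a minimal Python sketch of tracking by file name plus last modified date. It is not Aperture's actual implementation: the function names, the in-memory `processed` dictionary, and the `fnmatch`-style wildcard pattern are all assumptions made for illustration, and the real file pattern syntax may differ.

```python
from datetime import datetime, timezone
from fnmatch import fnmatch


def select_new_files(listing, processed, pattern="*"):
    """Return the (name, last_modified) pairs that should be loaded.

    listing   -- iterable of (name, last_modified) pairs, e.g. from a bucket listing
    processed -- dict mapping file name -> last_modified of the version already loaded
    pattern   -- hypothetical wildcard pattern; the real dropzone syntax may differ
    """
    to_load = []
    for name, last_modified in listing:
        if not fnmatch(name, pattern):
            continue  # file does not match the dropzone pattern
        if processed.get(name) == last_modified:
            continue  # same name and same timestamp: already loaded
        to_load.append((name, last_modified))
    return to_load


def mark_processed(processed, files):
    """Record each loaded file under its name with its last modified date."""
    for name, last_modified in files:
        processed[name] = last_modified


processed = {}
pattern = "customers*"  # generic start of the file name (illustrative)

# First poll: the file has never been seen, so it is selected.
first = [("customers_v1.csv", datetime(2024, 9, 1, tzinfo=timezone.utc))]
batch1 = select_new_files(first, processed, pattern)
mark_processed(processed, batch1)

# Second poll with an unchanged file: nothing is selected.
batch2 = select_new_files(first, processed, pattern)

# Third poll after the file was replaced: same name, newer timestamp, selected again.
second = [("customers_v1.csv", datetime(2024, 9, 2, tzinfo=timezone.utc))]
batch3 = select_new_files(second, processed, pattern)
```

Under this model, a replaced file is picked up again only because its last modified date differs from the one recorded for that name, which is the behaviour the documentation describes.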
