Impact of snapshots on memory
Nigel Light Contributor
I am using a job that has several input snapshots defined in it, though I only connect two of them each time I run the job. It seems to be using a lot of storage (and slowing Aperture down). Is this expected? Is it something to do with caching?
My thinking is to keep the number of snapshots in a job to a minimum. Does this apply to input data sources too, and is it best to delete these rather than just leaving them to 'hang around'?
Also, how do I go about deleting snapshots when they are no longer required - can anybody advise?
Thanks - and Happy Christmas
@Ian Thornton @Ian Hayden is this something you could help Nige with?
Merry Xmas to you and the team Nige!!
Thanks @Tanj Jagpal
NB: I found where to delete snapshots - cunningly hidden under the Monitoring icon
@Adrian Westlake maybe something to consider in v2 (if not already!)
@Nigel Light here is the documentation for snapshots. As you rightly discovered, you can delete snapshots; more importantly, you can limit the number of snapshots that Data Studio stores.
The vitals of each snapshot are visible under Monitoring, and snapshots can be deleted from there.
Thanks @Clinton Jones
I'm guessing there is no way of knowing, when developing a subsequent job, what version of the snapshot you are actually reading?
(These do not appear to be dynamic, i.e. a follow-on step selects the most recent snapshot at the time the job was created, so if the feeder step has since been rerun you would not get the latest version.)
Would be useful if this could be incorporated in v2 - either the option to make this dynamic or some indication of the version; particularly during the development stage when this might be fluid.
@Nigel Light there are three snapshot workflow steps.
The first one is obvious: it is the action of saving a snapshot.
The second is also obvious: it returns the latest snapshot that you have of that particular snapshot range.
The third is perhaps a little less obvious if you haven't played with it: '0' will show all the known content for a given snapshot, and '2' will give the latest two.
It gives you the option to see the latest snapshot or the accumulated results of a number of snapshots, and to time-box some of the output.
As an example, with the value '2':
As compared with '0', meaning all snapshot rows:
Here you can see I have 4 runs of 22 rows each.
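To make the range value concrete, here is a minimal Python sketch of how I read the behaviour described above. It is only a model of the idea, not Data Studio code: `select_snapshot_range` and the row layout are hypothetical names made up for illustration, and the assumption is that runs are held oldest first.

```python
from typing import Dict, List


def select_snapshot_range(runs: List[List[Dict]], count: int) -> List[Dict]:
    """Return the rows of the most recent `count` runs; 0 means every run.

    Hypothetical helper for illustration only, not a Data Studio API.
    `runs` is assumed to be ordered oldest first.
    """
    if count < 0:
        raise ValueError("count must be 0 or a positive integer")
    chosen = runs if count == 0 else runs[-count:]
    # Flatten the chosen runs into a single list of rows.
    return [row for run in chosen for row in run]


# Four runs of 22 rows each, as in the example above.
runs = [[{"run": r, "row": i} for i in range(22)] for r in range(1, 5)]

latest_two = select_snapshot_range(runs, 2)
everything = select_snapshot_range(runs, 0)

print(len(latest_two))  # 44 rows: the two most recent runs
print(len(everything))  # 88 rows: all four runs
```

So under this reading, '2' would return 44 rows and '0' would return all 88.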
Doesn't 'Latest Snapshot' take the snapshot from job 1 at the time of job 2 being created?
So, if job 1 is subsequently rerun, the developer has to reattach the 'Latest Snapshot' node to pick up the output from the most recent run of job 1?
It sounds like I'll have to play with 'Use Snapshot Range' to obtain the most recent run of job 1 every time I run job 2.
Latest snapshot is the last snapshot generated.
If you have a job scheduled to run at noon every day and it takes, say, 10 minutes, then at 12:01, while the job is still running, the snapshot has not yet been finalized, so you will see yesterday's snapshot.
This is also why it is better not to have snapshot dependencies in parallel flows in the same workflow.
You should always use a separate workflow, or cascade the second piece of logic from the output node of the snapshot in your workflow.
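The timing point can be sketched in a few lines of Python. This is a toy model under my own assumptions, not how Data Studio is implemented: the `Snapshot` class, the `latest_finalized` function and the dates are all hypothetical, and the only rule modelled is that a snapshot becomes visible once its run has finished.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class Snapshot:
    # Hypothetical record: when the generating run started and finished.
    started_at: datetime
    finalized_at: datetime


def latest_finalized(snapshots: List[Snapshot], now: datetime) -> Optional[Snapshot]:
    """Return the most recently finalized snapshot as of `now`, if any."""
    done = [s for s in snapshots if s.finalized_at <= now]
    return max(done, key=lambda s: s.finalized_at, default=None)


# Illustrative dates: a job that starts at noon and takes 10 minutes.
yesterday = Snapshot(datetime(2023, 12, 20, 12, 0), datetime(2023, 12, 20, 12, 10))
today = Snapshot(datetime(2023, 12, 21, 12, 0), datetime(2023, 12, 21, 12, 10))
snapshots = [yesterday, today]

# At 12:01 today's run is still in progress, so yesterday's snapshot is returned.
during_run = latest_finalized(snapshots, datetime(2023, 12, 21, 12, 1))
# At 12:11 today's run has finished, so today's snapshot is returned.
after_run = latest_finalized(snapshots, datetime(2023, 12, 21, 12, 11))

print(during_run is yesterday)  # True
print(after_run is today)       # True
```

A reader that runs in parallel with the writer is in the 12:01 position, which is why cascading from the snapshot's output node (or using a separate workflow that runs afterwards) is the safer arrangement.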
I'm using separate workflows, but the latest snapshot seems to be the latest one at the time the step was added to the job and isn't dynamic, i.e. if job 1 is subsequently rerun, the new snapshot won't be picked up in job 2 without deleting and re-adding the step.
If I use snapshot range and generate more snapshots, will this create a higher snapshot number each time? So, if I want the most recent, will I need to amend the snapshot number?
Off now - mince pies and turkey are calling. Catch up on the other side... and hope you and all readers have a good Christmas
This sounds like a timing issue - how soon after the run of workflow #1 are you running workflow #2?
Effectively, the snapshots are stored with all the rows in a run having a date and timestamp, although I believe a number is also stored somewhere in the repository.
You shouldn't have to mess with numbers generally.
Using the latest snapshot step should give you the snapshot that was generated the last time the generating workflow ran.
If it isn't then there is something else going on...
Have a good break and let's pick this one up in the New Year!
Thanks @Clinton Jones - this was my understanding but this doesn't seem to be our experience. Nige
@Nigel Light I would recommend opening a support case to get it investigated further