Hi all,
I have a question regarding the use of ap_pipe on a large number of visits.
I am using HSC PDR2 data (I already have calexps and difference images, so I am only performing association, thanks to --reuse-outputs-from differencer).
I created templates for each band, and I am running ap_pipe on a large number of visits, using 16 cores at CC-IN2P3 with -j 16.
Some of my jobs ran fine, for instance one with 24 visits, but others failed.
A job with 47 visits just crashed, and Slurm gave me some insight:
maxvmem: 137.644G
maxrss: 28.874G
maxpss: 26.948G
which is more than I have access to.
Since it is an “alert” pipeline, I guess ap_pipe was designed to run on single nights, rather than on many visits spread over months?
How different would it be to run ap_pipe:
- visit by visit or night by night (where I would wait for each job to be finished)
- visit by visit or night by night all in parallel
- with all the visits provided on the command line, the way I do it now?
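To make the night-by-night option concrete, here is a minimal sketch of how I would split my visit list into one group per night. The visit numbers and dates are made up for illustration, `group_visits_by_night` is a hypothetical helper of mine, and the `visit=A^B` string assumes the Gen2-style data ID syntax, so please correct me if that is not the right way to pass it.

```python
from collections import defaultdict


def group_visits_by_night(visit_nights):
    """Return {night: sorted list of visits}, with nights in chronological order."""
    groups = defaultdict(list)
    for visit, night in visit_nights.items():
        groups[night].append(visit)
    # ISO dates sort chronologically when sorted as strings.
    return {night: sorted(visits) for night, visits in sorted(groups.items())}


# Hypothetical visit -> observing night mapping (NOT real HSC PDR2 visits).
visit_nights = {
    1003: "2015-03-18",
    1001: "2015-01-21",
    1002: "2015-01-21",
    1005: "2015-05-17",
    1004: "2015-03-18",
}

nights = group_visits_by_night(visit_nights)

for night, visits in nights.items():
    # Assumption: Gen2-style data ID, where "^" joins alternative values.
    data_id = "visit=" + "^".join(str(v) for v in visits)
    print(night, data_id)
```

Each printed line would then correspond to one job, either submitted one after the other or all at once, which is exactly the choice I am unsure about.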
I guess my question is: can I split my ap_pipe runs by night and run them in parallel, or would that somehow break the “association”?
Best,
Ben
PS: I think this is related to the questions in this posting: Ap_pipe in parallel jobs, but the answer seemed inconclusive to me, so I just want to check again.