With /datasets now available, DR in the works, the lsst-dev7 transition to full ops coming, and NFS retirement in the near future, it is time to finalize the plan for organizing camera data for project use. The goals are to reduce the new-developer ‘discovery’ curve, encourage and enable use of new capabilities, and all that good stuff.
Background information is available in RFC-95.
Layout for initial loading
With the help of @hsinfang and @daues, we have identified these candidate datasets for copying into /datasets as defined below. Note, we are proposing ‘copying’ data at this time, not moving, in order to not disrupt current development efforts.
We ask that the owner (or delegate, domain expert, manager, whatever) confirm the destination. NCSA is available to perform the data copy. Each set will require someone with the proper domain knowledge to handle its butler-ization.
(If the coordination here becomes too unwieldy, we will move to Jira.)
/datasets/astrometry_net_data/ (source: /lsst7/astrometry_net_data/, owner: @price)
/datasets/decam/data/ (source: /lsst8/decam, owner: @mwv)
/datasets/hsc/commissioning/ (source: /lsst3/HSC/, owner: @price)
/datasets/hsc/newhorizons/ (source: /lsst8/ctslater/nh_data minus _parent, rerun, owner: @ctslater)
/datasets/sdss/preprocessed/dr9/ (source: /lsst7/sdss/dr9/, owner: @mjuric)
/datasets/lsstSim (source: external)
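For anyone who wants to preview what a copy would look like before confirming a destination, here is a minimal sketch that turns a few of the source/destination pairs above into dry-run rsync command lines. It only builds and prints the commands; nothing is copied. The helper names (`COPY_PLAN`, `rsync_command`, `render_plan`), the choice of rsync, and the trailing slash added to the nh_data source are assumptions for illustration, not the procedure NCSA will necessarily use.

```python
import shlex

# Subset of the source -> destination pairs proposed in this post.
# The exclude names for newhorizons come from "minus _parent, rerun".
# Assumption: a trailing slash is added to the nh_data source so rsync
# copies the directory contents, matching the other entries.
COPY_PLAN = [
    ("/lsst7/astrometry_net_data/", "/datasets/astrometry_net_data/", []),
    ("/lsst3/HSC/", "/datasets/hsc/commissioning/", []),
    ("/lsst8/ctslater/nh_data/", "/datasets/hsc/newhorizons/", ["_parent", "rerun"]),
]


def rsync_command(source, destination, excludes=(), dry_run=True):
    """Build an rsync argument list that copies (never moves) source into destination."""
    cmd = ["rsync", "-a"]
    if dry_run:
        # --dry-run reports what would be transferred without writing anything.
        cmd.append("--dry-run")
    cmd.extend(f"--exclude={name}" for name in excludes)
    cmd.extend([source, destination])
    return cmd


def render_plan(plan, dry_run=True):
    """Return one shell-quoted command line per planned copy."""
    return [shlex.join(rsync_command(s, d, e, dry_run)) for s, d, e in plan]


if __name__ == "__main__":
    for line in render_plan(COPY_PLAN):
        print(line)
```

Keeping the plan as data makes it easy for an owner to correct a destination in one place before any copy is run.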
These two are not verification data but I would like to make them equally accessible.
/datasets/all-sky (source: /lsst/all-sky-ASIVA, owner: Mike Fitzgerald)
/datasets/all-sky-ASIVA (source: /lsst/all-sky-ASIVA, owner: Jacques Sebag)
These two snuck in the backdoor of the new home. If this is verification data, please suggest a destination.
/datasets/gaia/ (source: /gpfs/fs0/home/ctslater/gaia_refcat, owner: @ctslater)
/datasets/ (source: /gpfs/fs0/home/fforster/???)
On Immutability
I have a plan for how we can safeguard and even verify the integrity of these datasets, but that will be revealed in another thread. For this initial loading, we will secure them by hand.
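In the meantime, one common way to check that a copied dataset has not changed is a checksum manifest. The sketch below is only an illustration of that general idea and is not necessarily the plan mentioned above; the function names (`sha256_of`, `build_manifest`, `verify_manifest`) and the choice of SHA-256 are assumptions.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_manifest(root, manifest):
    """Return the sorted relative paths that were added, removed, or modified."""
    current = build_manifest(root)
    return sorted(
        name
        for name in set(manifest) | set(current)
        if manifest.get(name) != current.get(name)
    )
```

A manifest built right after the initial copy could be stored outside the dataset directory and compared later; an empty list from `verify_manifest` would mean no file was added, removed, or modified.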
Creation / Deletion Policy
RFC-95 touched on the need for an RFC when removing ‘public’ datasets. I believe we also need a formal procedure for introducing datasets and defining their layouts. Again, that is coming, likely via RFC.