A (slightly edited) summary of the HipChat thread that followed is:
@jbosch
It’s not well documented anywhere, but the extra extensions beyond the first three are not really intended for use by FITS readers beyond our own code.
@RHL
The answer is, “Use the butler” – then you never need know.
@jbosch
They’re essentially an opaque blob that contains the PSF model, a potentially better representation of the WCS, and a list of all input images if the image is a coadd. They may contain even more things in the future.
@RHL
Naturally, you really want to use external code. The first three HDUs are as for MaskedImages, and that’s probably documented (I’ll look it up). The others are things like the WCS and PSF, and Jim will reply…
@ktl
The first 3 HDUs are https://lsst-web.ncsa.illinois.edu/doxygen/x_masterDoxyDoc/afw_sec_image_i_o.html
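(Editorial aside: the three-plane MaskedImage layout described at that link can be exercised without the stack using plain astropy. This is a minimal sketch, assuming astropy is available; the file name and EXTNAME values here are illustrative, not the stack’s actual keywords.)

```python
import numpy as np
from astropy.io import fits

# Build a tiny stand-in for the on-disk layout: HDU 0 is the primary
# header, followed by the image, mask, and variance planes in that order.
image = np.ones((4, 4), dtype=np.float32)
mask = np.zeros((4, 4), dtype=np.int32)       # per-pixel bitmask
variance = np.full((4, 4), 0.25, dtype=np.float32)
hdus = fits.HDUList([
    fits.PrimaryHDU(),
    fits.ImageHDU(image, name="IMAGE"),
    fits.ImageHDU(mask, name="MASK"),
    fits.ImageHDU(variance, name="VARIANCE"),
])
hdus.writeto("masked_image.fits", overwrite=True)

# A standalone reader only needs the HDU order, not the stack:
with fits.open("masked_image.fits") as f:
    img, msk, var = f[1].data, f[2].data, f[3].data
```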
@jbosch
Also, what’s in those extra extensions can change based on how the stack is configured (which PSF modeling code you use, for instance), but they do contain enough information for the stack to know what’s in them; they’re self-describing in a very limited machine-readable sense, just not in any sort of human-readable sense.
@RHL
The corresponding SDSS files were the psField files. I provided a standalone C library that could be bound to e.g. IDL (it was a long time ago).
@ktl
If you need to understand this in any more detail, I think https://lsst-web.ncsa.illinois.edu/doxygen/x_masterDoxyDoc/classlsst_1_1afw_1_1image_1_1_exposure_info.html#aae90fe3f6a67c6a3ee9d68daf356c113 is the place to look.
@RHL
This should be on discourse. I don’t think that the current situation is acceptable in the medium term – not as bad as boost persistence but along the same lines. We need to be able to tell external users how to read our datasets without installing the full LSST stack. Some of this may be simplified (analogous to writing FITS WCS as well as a full layered-distortion model), some may be via standalone binaries (e.g. returning the PSF at a point). But I don’t think, “Use the butler” is going to be acceptable.
@jbosch
I actually think the on-disk formats are not too bad in terms of making it easy to write a stand-alone reader. The challenge for writing a stand-alone reader is, by far, just writing a stand-alone Psf or Wcs class to load into.
@jbosch
I have to admit I’m not terribly concerned about this problem. For at least Python and C++ users I think we should address it by trying to lower the burden of installing the individual components of the stack (which was never an option for SDSS), and I don’t really care that much about IDL users at this point (I’m sort of hoping they just go away as an important population by operations).
@mwv
An individual file being usable free from an ecosystem is always useful. But, more to the point of @nidever’s question, it’s not about providing wrappers in IDL. It’s about finding the documentation to understand what is in the file extensions.
@nidever
Yes, exactly. You can’t expect all users to only use the stack and the butler. If the documentation is there then they can figure out how to read/load the data.
Reading the extensions from IDL is trivial, but I need to know what I’m looking at.
What I mean is that there are IDL tools for easily reading images/binary tables from FITS into IDL arrays/structures. But then I need to know what I’m looking at and how to use it.
@nidever
In SDSS there’s been a tradition of having a “data model” that tells you the names, directory structure, and file format of all the inputs/outputs of a pipeline. This has been extremely useful.
@RHL
It doesn’t work for psField files – the code is too complex, and you can put anything in FITS.