The process of generating the Gaia DR2 reference catalogs has left me with some open questions about how we produce our refcats and what goes into them. The DR2 refcat I made should be useable for astrometric (but not photometric) fitting. However, there are ways it could be made more useful, and there are components of our refcat system that are currently unused.
- Our catalog normalization process (“ingest”) requires certain things to be true about the catalog we’re ingesting from, which may not always be true. For example, we get our fluxes by converting from a magnitude column, and we assume that a magnitude error field is also provided. Gaia does not provide magnitude errors, because “…the error distribution is only symmetric in flux space.” Astronomers’ reliance on passing around magnitudes strikes again!
- We have thing_flag columns in our refcats, but we mostly do not use them and have not specified what they mean other than “don’t trust thing”. Thus, I didn’t implement parallax_flag or flux_flag for Gaia, because I didn’t know how to choose what “trustworthiness” meant. Converting each external catalog’s idiosyncratic definitions of “good/bad” fields would require some specialized subclassing of our Ingester, which didn’t seem worth it in this case. And the work I’ve done to parallelize the ingestion process may have made such future subclassing more difficult.
- The external catalogs may provide other fields that are useful: we have a way to just send them straight over, but that doesn’t mean anyone can use them (if foo is only in the Gaia refcat, you can’t write code that assumes it will be in every refcat). Sometimes those fields do contain information we need, but in a way that again requires a specialized subclass. For example, Gaia has a field for whether a source is variable, but it’s a trinary string field, and thus not trivial to convert into our boolean “is_variable” flag.
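To make the list above a bit more concrete, here is a rough sketch of two Gaia-specific conversions of the kind a specialized subclass might need. It is only an illustration under stated assumptions, not our Ingester API: the function names are hypothetical, the zero point assumes AB magnitudes (Gaia’s G magnitudes are Vega-based, so the flux conversion is not photometrically correct as written), the three flag strings are assumed to be the values of the DR2 phot_variable_flag column, and mapping “not available” to “not variable, but unknown” is just one possible policy.

```python
import math

# Assumption: AB system, where magnitude 0 corresponds to 3631 Jy = 3.631e12 nJy.
AB_ZERO_POINT_NJY = 3631e9


def mag_to_flux_njy(mag):
    """Convert a magnitude to a flux in nanojansky (AB zero point assumed)."""
    return AB_ZERO_POINT_NJY * 10.0 ** (-0.4 * mag)


def flux_err_from_fractional(mag, src_flux, src_flux_err):
    """Estimate a flux error without a magnitude error column.

    The ratio src_flux_err / src_flux does not depend on the flux units,
    so the catalog's own flux and flux error (in whatever units it uses)
    can be used to scale the converted flux.
    """
    if src_flux <= 0:
        return math.nan  # a fractional error is not meaningful here
    return mag_to_flux_njy(mag) * (src_flux_err / src_flux)


# Hypothetical policy: map each string to (is_variable, variability_known).
# The flag strings are assumed, not taken from our code.
_VARIABLE_FLAG_POLICY = {
    "VARIABLE": (True, True),
    "CONSTANT": (False, True),
    "NOT_AVAILABLE": (False, False),
}


def variable_flags(phot_variable_flag):
    """Split a three-valued string into two booleans."""
    key = phot_variable_flag.strip().upper()
    try:
        return _VARIABLE_FLAG_POLICY[key]
    except KeyError:
        raise ValueError(f"unrecognized variability flag: {phot_variable_flag!r}") from None
```

The second function shows why a single boolean is awkward here: the third state has to go somewhere, either into a second flag (as above) or into a convention we would need to document.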
In generating the Gaia DR2 refcat, I decided to take the expedient route and just get something done that we could use. I think we’re probably going to have to wrestle with these questions before too long, and I wanted to at least get them written down so they are not forgotten.
I almost feel like we should rope a few people (Jim, Yusra, Paul at least?) together for an hour or so at the LSST August meeting to hash out an approach.