We are preparing to conduct pipeline runs and ingest the results into Qserv. For the purposes of this question, these runs should be similar to a standard run producing measurement and forced catalogues from coadded images. Currently, we use the Butler to retrieve catalogues and photometric calibrations. We then use the lsst.afw.table.BaseCatalog.asAstropy method to get the full table, which we modify in various ways before writing it to CSV for ingestion into MySQL. You can see a rough test script here.
Firstly, is this basic procedure fundamentally wrong? That is, am I missing some already available means of turning Butler products into CSV, TSV, or Parquet files that Qserv and/or MySQL can ingest?
Secondly, if the answer to the previous question is no, what key features should the resultant CSV tables have? We are making various decisions regarding 32- vs 64-bit floats, maximum column name lengths, whether to duplicate repeated columns for each band, whether to add new columns to facilitate useful joins, and whether to create additional metadata tables to aid queries. If the asAstropy method is not optimised for creating final ingestible files, what are the key things we should modify?
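To make the float-width and NULL-handling decisions concrete, here is a minimal sketch of the kind of formatting step we mean. It uses only the Python standard library rather than the LSST stack, and the column names (objectId, coord_ra, g_psfFlux) and helper names are hypothetical. It assumes the CSV is loaded with MySQL's default LOAD DATA escaping, under which \N is read as NULL, and it writes 9 significant digits for 32-bit columns and 17 for 64-bit columns so that values round-trip exactly.

```python
import csv
import io
import math
import struct


def to_float32(value):
    """Round a 64-bit Python float to the nearest 32-bit float.

    Assumes the value is within float32 range; struct raises
    OverflowError otherwise.
    """
    return struct.unpack("<f", struct.pack("<f", value))[0]


def format_cell(value, single_precision=False):
    """Format one cell for a MySQL-style CSV file.

    None, NaN and infinities become \\N, which LOAD DATA reads as NULL
    when the default backslash escaping is in effect.
    """
    if value is None:
        return r"\N"
    if isinstance(value, float):
        if not math.isfinite(value):
            return r"\N"
        if single_precision:
            # 9 significant digits are enough to round-trip a float32.
            return format(to_float32(value), ".9g")
        # 17 significant digits are enough to round-trip a float64.
        return format(value, ".17g")
    return str(value)


def write_csv(rows, columns, float32_columns=()):
    """Serialise a list of dict rows to CSV text in a fixed column order."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for row in rows:
        writer.writerow(
            [format_cell(row.get(name), name in float32_columns) for name in columns]
        )
    return buffer.getvalue()


# Hypothetical example row: an id, a 64-bit coordinate, a missing 32-bit flux.
rows = [{"objectId": 1, "coord_ra": 150.123456789012, "g_psfFlux": float("nan")}]
text = write_csv(rows, ["objectId", "coord_ra", "g_psfFlux"], {"g_psfFlux"})
```

Whether coordinates should stay 64-bit while fluxes drop to 32-bit is exactly the sort of choice we would like guidance on; the sketch only shows where that choice would be made.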
Finally, we are unsure whether Qserv distributes blocks according to id or sky position. We want to make sure our new tables will be sensibly distributed under the same system to optimise queries. These new tables will contain the public LSST objects in addition to new detections that we are merging with the mergeCoaddDetections task. Should this be an important consideration when making the tables for ingestion? We are concerned that Qserv might place the LSST-detected objects in our tables in different blocks from the same objects in the main LSST tables, so that joins on those ids are not efficient.
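To illustrate the concern, here is a toy model of partitioning by sky position. It is not Qserv's partitioning code: the function name, the stripe count, and the chunk numbering are all made up, and the only assumption is that the sky is cut into declination stripes which are then divided in right ascension. Under any scheme of this kind, the same object stored with slightly different coordinates in two tables can land in different blocks.

```python
import math


def toy_chunk_id(ra_deg, dec_deg, num_stripes=18):
    """Assign a sky position to a block in a toy stripe-and-chunk scheme.

    Hypothetical illustration only; not Qserv's actual partitioner.
    """
    stripe_height = 180.0 / num_stripes
    # Declination stripe index, clamped so that dec = +90 stays in range.
    stripe = min(int((dec_deg + 90.0) / stripe_height), num_stripes - 1)
    # Use fewer chunks per stripe towards the poles so areas stay comparable.
    dec_mid = -90.0 + (stripe + 0.5) * stripe_height
    chunks_in_stripe = max(1, int(2 * num_stripes * math.cos(math.radians(dec_mid))))
    chunk = min(int((ra_deg % 360.0) / 360.0 * chunks_in_stripe), chunks_in_stripe - 1)
    # Each stripe reserves 2 * num_stripes ids, so ids are unique across stripes.
    return stripe * 2 * num_stripes + chunk


# Identical coordinates always map to the same block...
same = toy_chunk_id(150.0, 2.5) == toy_chunk_id(150.0, 2.5)
# ...but a small offset across a stripe boundary changes the block.
split = toy_chunk_id(150.0, 9.9999) != toy_chunk_id(150.0, 10.0001)
```

If Qserv instead places rows according to the position associated with the object id in a reference table, this problem would not arise, which is why we would like to know which columns actually drive the distribution.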
We have progressed since the previous questions we asked on this topic here and here, and would appreciate any information about future plans or recent development on this front. We are making and ingesting CSVs successfully, but want to make sure the final tables are as close to the main public Qserv ones as possible.
Many thanks,
Raphael.