I’m working with @afausti and the rest of @SQuaRE on the SQuaSH QA system. A compelling use-case for QA is to provide developers with feedback on how their commits affect science quality metrics in much the same way as the CI system validates the code itself.
To make this great, we’ll want to tie each QA run (e.g., `validate_drp` running on the output of `processCCD`) to the exact version composition of the Stack. This way we could compare metric time series for `master` against ticket branch development. Going further, we’ll want to tie QA runs to individual commits on a ticket branch so that a developer can understand exactly how their code changes influence science metrics.
Rather than capturing provenance ourselves, we should certainly adopt LSST’s provenance database. @jbecla has posted the provenance design at https://github.com/lsst-dm/provenance_proto/blob/tickets/DM-3962/Provenance.md; however, I don’t fully understand how software provenance works here.
My questions/needs:
- How is Stack software provenance stored? We’d like to know:
  - Package names
  - GitHub URLs for the authoritative package repositories
  - Git commit of each software repository used in the Stack
  - The branch name that commit resolves to (although that’s easy to compute)
- Is the provenance system such that every QA run will be registered in the provenance DB?
  - We’ll need an identifier to tie the QA run (a Jenkins job number) to records in the QA DB.
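To make the request above concrete, here is a rough sketch of the per-run record we have in mind. It is not taken from the provenance design; the class names (`PackageVersion`, `QARun`), the `changed_packages` helper, the example.org URLs, the commit SHAs, and the `tickets/DM-0000` branch are all hypothetical placeholders. It only illustrates how the fields listed above, keyed by a Jenkins job number, would let us see which packages differ between a `master` run and a ticket branch run.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class PackageVersion:
    """One Stack package as built for a QA run (hypothetical schema)."""
    name: str      # package name
    repo_url: str  # authoritative repository URL
    commit: str    # Git commit SHA that was checked out
    branch: str    # branch that the commit resolves to


@dataclass
class QARun:
    """A QA run identified by its Jenkins job number (hypothetical schema)."""
    jenkins_job: int
    packages: List[PackageVersion] = field(default_factory=list)

    def commits(self) -> Dict[str, str]:
        """Map each package name to the commit used in this run."""
        return {p.name: p.commit for p in self.packages}


def changed_packages(baseline: QARun, candidate: QARun) -> Dict[str, Tuple[str, str]]:
    """Return {package: (baseline_commit, candidate_commit)} where commits differ."""
    base = baseline.commits()
    return {
        name: (base.get(name, ""), sha)
        for name, sha in candidate.commits().items()
        if base.get(name) != sha
    }


# Placeholder data: the URLs, SHAs, and job numbers are made up for illustration.
master_run = QARun(
    jenkins_job=101,
    packages=[
        PackageVersion("validate_drp", "https://example.org/validate_drp.git", "aaaa111", "master"),
        PackageVersion("pipe_tasks", "https://example.org/pipe_tasks.git", "bbbb222", "master"),
    ],
)
ticket_run = QARun(
    jenkins_job=102,
    packages=[
        PackageVersion("validate_drp", "https://example.org/validate_drp.git", "cccc333", "tickets/DM-0000"),
        PackageVersion("pipe_tasks", "https://example.org/pipe_tasks.git", "bbbb222", "master"),
    ],
)

print(changed_packages(master_run, ticket_run))
# {'validate_drp': ('aaaa111', 'cccc333')}
```

If the provenance DB can give us something equivalent to `QARun.packages` for a given job number, SQuaSH could attribute a change in a metric to the specific package commits that differ between two runs.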
For completeness, these discussions are related: