I am working on migrating the existing pipetask command to use the same command line framework as the recently-introduced butler command, and would appreciate feedback on an interface change described below:
We are considering using a feature that allows pipetask subcommands to be “chained” together. This allows the products of one subcommand to be passed to the next subcommand when they are called in the same command execution. For example, in the following command, the build subcommand creates a pipeline instance which is passed to the qgraph subcommand, which then uses that pipeline when building a quantum graph (and then saves the quantum graph to a file called qgraph_file).
pipetask build -p ci_hsc_gen3/pipelines/CiHsc.yaml \
qgraph -d "patch = 69" \
-b ci_hsc_gen3/DATA/butler.yaml \
--input $INPUTCOLL \
-o $COLLECTION \
--save-qgraph qgraph_file
This differs from the existing pipetask command, where the qgraph subcommand accepts all the options for the build subcommand and implicitly executes build before qgraph. (Similarly, in the existing pipetask command, the run subcommand accepts options for the build and qgraph subcommands, and executes build and qgraph before run.) The same call as above with the existing interface would be:
pipetask qgraph -p ci_hsc_gen3/pipelines/CiHsc.yaml \
-d "patch = 69" \
-b ci_hsc_gen3/DATA/butler.yaml \
--input "$INPUTCOLL" \
-o "$COLLECTION" \
--save-qgraph qgraph_file
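To make the chaining idea concrete, here is a minimal, standard-library-only sketch of how a chained command line could be split into subcommands, with each subcommand's product handed to the next one. It is not the pipetask2 implementation: the function names, the dictionary used as the "product", and the choice to raise an error when qgraph has no upstream pipeline are all assumptions made for illustration.

```python
# Hypothetical sketch of chained subcommands; none of these names come
# from pipetask, daf_butler, or ctrl_mpexec.
SUBCOMMANDS = {}


def subcommand(func):
    """Register a function as a subcommand under its own name."""
    SUBCOMMANDS[func.__name__] = func
    return func


@subcommand
def build(args, upstream):
    # Stand-in for creating a pipeline instance from the build options.
    return {"pipeline": args}


@subcommand
def qgraph(args, upstream):
    # In this sketch qgraph never runs build implicitly; it only consumes
    # whatever the previous subcommand produced.
    if upstream is None or "pipeline" not in upstream:
        raise SystemExit("qgraph needs a pipeline from an earlier subcommand")
    return {**upstream, "qgraph": args}


def split_chain(argv):
    """Group argv into (subcommand name, its arguments) pairs.

    Simplification: any token equal to a subcommand name starts a new
    group, so an option value that happens to be named 'build' or
    'qgraph' would be split incorrectly.
    """
    groups = []
    for token in argv:
        if token in SUBCOMMANDS:
            groups.append((token, []))
        elif groups:
            groups[-1][1].append(token)
        else:
            raise SystemExit(f"expected a subcommand, got {token!r}")
    return groups


def run_chain(argv):
    """Run each subcommand in order, passing its product to the next."""
    product = None
    for name, args in split_chain(argv):
        product = SUBCOMMANDS[name](args, product)
    return product
```

Calling `run_chain(["build", "-p", "CiHsc.yaml", "qgraph", "-d", "patch = 69"])` runs build first and hands its result to qgraph, mirroring the order of the subcommands on the command line.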
While the difference in this example is small, with chained subcommands, calling one subcommand does not imply running any other subcommand. This has several benefits:
- It separates the concerns of each subcommand.
- It allows for alternative subcommand implementations.
- It allows new subcommands to be created and called between the existing subcommands where appropriate.
However, there are some tradeoffs:
- The interface is somewhat more verbose and specific:
  If you want to build a quantum graph from a new pipeline, you must call both subcommands with each subcommand’s options after it: pipetask build <build options> qgraph <qgraph options>, whereas before only one subcommand had to be called and all the options could be passed after it: pipetask qgraph <build options and qgraph options>.
- You can’t see help for all subcommands at once:
  Help for each of the subcommands is viewed separately: pipetask build -h, pipetask qgraph -h, and pipetask run -h show help for each of the three existing subcommands. With the existing pipetask implementation, pipetask run -h would show options for build, qgraph, and run.
- Some options are repeated:
  In some cases, two or more subcommands may have options in common. For example, qgraph and run both accept options for configuring a butler. At best it’s annoying to have to enter the same option values for two subcommands (pipetask build -p pipeline.yaml qgraph <butler options> run <same butler options>). I implemented a possible solution called “option forwarding”, where some of a subcommand’s options may “forward” to the next subcommand. You can see which options forward from a subcommand in its help output; they are indicated by the text “(f)” after the option’s metavariable information. Forwarded options may be overridden by passing that option to the next subcommand on the command line.
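The option forwarding behavior described in the last item can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual implementation: the `FORWARDED` table, the option names `butler_config` and `pipeline`, and the `resolve_options` helper are all hypothetical, and the set of options that really forward is whatever each subcommand's help output marks with “(f)”.

```python
# Hypothetical table of which options each subcommand forwards to the
# next one; the real set is whatever the help output marks with "(f)".
FORWARDED = {"qgraph": {"butler_config"}}


def resolve_options(chain):
    """Compute the effective options for each subcommand in a chain.

    `chain` is a list of (subcommand name, options given on the command
    line) pairs, in command-line order. Options forwarded from the
    previous subcommand are used as defaults, and any value given
    explicitly to the next subcommand overrides the forwarded one.
    """
    carried = {}
    resolved = []
    for name, given in chain:
        # Explicit values win over values forwarded from upstream.
        effective = {**carried, **given}
        resolved.append((name, effective))
        # Only options this subcommand marks as forwarding are passed on.
        forwardable = FORWARDED.get(name, set())
        carried = {k: v for k, v in effective.items() if k in forwardable}
    return resolved
```

With this sketch, a chain of build, qgraph (given a butler configuration), and run (given nothing) resolves so that run sees the same butler configuration as qgraph, while passing the option to run explicitly replaces the forwarded value.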
Both versions of the pipetask command are available for testing and experimenting. The new interface is implemented in a command called pipetask2, and the original pipetask command is not changed. Changes were checked in this morning, so until the next weekly you will need at least today’s version of daf_butler, obs_base, and ctrl_mpexec.
I would like to know:
- If you have a preference for one interface or the other.
- If you have questions or concerns about the chained subcommand interface or implementation.
- If you have ideas for improvement or alternative approaches.
- (anything else you’d like to say about it)
Thanks!