Preface
Command line tasks are likely to be the first point of contact many astronomers have with the LSST Stack. This means that the experience of using command line tasks is extremely important.
If the command line experience is bad, frustrating, or even just unpolished, we’re probably going to lose that astronomer forever. Certainly that astronomer will be reluctant to invest in learning our Python API if we allow them to believe that our entire stack is badly designed and executed.
With that in mind, I wanted to start a conversation about how we can deliver the best command line experience possible. My comments here are mostly agnostic of the actual architecture. I’m focusing entirely on the look and feel of a command line task. In the title I deliberately used the software hipster term ‘UX’ (meaning ‘user experience’), since I believe that we should treat the medium of the command line with the same reverence as tech companies treat iPhone screens or browsers.
We’ll know we’ve succeeded in designing the command line task experience when we can give a demo and hear the audience mutter “whoa, that’s cool!” This is the design we should strive for.
I also want to make two disclaimers:
- I mean no offence to those whose existing code I might characterize as an anti-pattern. I just want to help make things better.
- I know these suggestions are outside the scope of the current SuperTask design. I think it’s worth starting this discussion now, though, to ensure that our overall task roadmap takes UX into consideration.
Some issues with tasks
Task names aren’t always coherent
One of the first things that struck me about our command line tasks is that they look messy. See the task list in the pipe_tasks bin directory. And by messy, I mean that the names and verbs of the tasks don’t present a coherent vocabulary. To me, command line tasks look like an afterthought.
Command line vocabularies can be beautiful. For example, the vocabulary for vagrant:
vagrant box
vagrant init
vagrant up
vagrant connect
vagrant suspend
With a controlled vocabulary like this, the vagrant app suddenly looks simple and knowable.
For the stack, it’s unclear what many tasks do from their name alone. dumpTaskMetadata.py is self-described in its own docstring as a tool to
Select images and report which tracts and patches they are in
I would never have guessed that. Unspecific names also make the docs harder to read, since one can’t find expected keywords while scanning the table of contents.
Task documentation is lacking
Even if the user has found the right task, we have the problem of documentation. We fundamentally need all tasks to be comprehensively documented in task docstrings and rendered to the LSST Stack Handbook.
But even then tasks are challenging to document because they are so configurable. It’s possible for sub-tasks to be redirected. Thus any ‘static’ documentation can be contradicted by task redirection done by the user.
A vision of the command line experience
Here I present a vision of what our command line experience could be like.
The lsst command
When we tell a new user about LSST’s task pipeline we tell them one thing: “check out the lsst app.”
> lsst
All tasks are namespaced into subcommands, in the same sense as sprawling command line applications like git or our example vagrant from above.
Tasks would then be sub-commands
> lsst process-ccd [args]
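To make this concrete, here is a minimal sketch of how a single `lsst` entry point could dispatch subcommands using Python’s stdlib argparse. The task name, its arguments, and the stub handler are all illustrative, not real Stack APIs:

```python
import argparse

# Hypothetical stub standing in for a real task's run method.
def process_ccd(args):
    return f"processing CCDs for {args.id}"

def build_parser():
    """Build the root `lsst` parser with tasks as subcommands."""
    parser = argparse.ArgumentParser(prog="lsst")
    subparsers = parser.add_subparsers(dest="command")

    ccd = subparsers.add_parser("process-ccd", help="process raw CCD frames")
    ccd.add_argument("--id", help="Butler data id selector (illustrative)")
    ccd.set_defaults(func=process_ccd)
    return parser

def main(argv):
    args = build_parser().parse_args(argv)
    if not hasattr(args, "func"):
        # Bare `lsst` prints guidance rather than a traceback.
        build_parser().print_help()
        return None
    return args.func(args)
```

Each new task only needs to register its own subparser, so the vocabulary stays in one place rather than scattered across bin/ scripts.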
Tasks provided with instruments or by community packages would have their own command space
> sdss process-ccd [args]
> decam process-ccd [args]
> megacam process-ccd [args]
You’ll also notice that in such a command architecture, we’ve done away with amateur-looking taskName.py script names. The lsst task signature signals to the user: these aren’t cobbled-together scripts; this is a well-engineered application.
When you run the root command
A user knows that the LSST pipeline is packaged in the lsst command. But that user doesn’t really know anything else, let alone how to run the pipeline.
The natural thing is to just type at the command line:
> lsst
When this happens, we help the user! The root command prints out a small help message pointing to the online task documentation. It also goes a step further, and prints out a list of all available command line tasks.
Now without even reading the docs, the user has a list of commands to try.
(Note: to be idiomatic and safe, lsst --help will do the same thing.)
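A sketch of what the bare `lsst` invocation could print: a pointer to the docs plus every registered task. The registry dict, the task names, and the documentation URL are all stand-ins for whatever plugin mechanism and hosting the Stack actually adopts:

```python
# Illustrative registry; a real one would be populated by plugins.
TASK_REGISTRY = {
    "process-ccd": "Process raw CCD frames through ISR and calibration",
    "make-coadd": "Assemble coadded images from warped exposures",
}

def root_help():
    """Render the message printed when `lsst` is run with no arguments."""
    lines = [
        "usage: lsst <task> [args]",
        "Documentation: see the LSST Stack Handbook",
        "",
        "Available tasks:",
    ]
    width = max(len(name) for name in TASK_REGISTRY)
    for name, summary in sorted(TASK_REGISTRY.items()):
        lines.append(f"  {name:<{width}}  {summary}")
    return "\n".join(lines)
```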
Getting help on running a task
So the user knows the commands, but how are they run and what do they do? The initial help message will tell the user to try any command with the help verb, as in:
> lsst help process-ccd
This will show user-oriented task documentation, including a usage example, and a list of arguments and their defaults.
Note that this documentation should include a schematic flow of any subtasks called. The argument list should include arguments associated with the subtasks.
Since the total collection of arguments might be overwhelming, some arguments may be labeled as ‘superuser arguments’ whose defaults can usually be assumed to be correct. By default these superuser arguments would be omitted from the help printout.
> lsst help --all process-ccd
would reveal them.
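One possible mechanism for this hide/reveal behavior is argparse’s help suppression: register superuser arguments with help=argparse.SUPPRESS unless the expanded listing was requested. The --overscan-fit-order argument below is purely illustrative:

```python
import argparse

def build_help_parser(show_all=False):
    """Build a task parser whose superuser arguments are hidden by default."""
    parser = argparse.ArgumentParser(prog="lsst process-ccd")
    parser.add_argument("--id", help="Butler data id selector (illustrative)")
    # help=argparse.SUPPRESS omits the option from the help printout entirely.
    superuser_help = (
        "overscan polynomial fit order (superuser)"
        if show_all
        else argparse.SUPPRESS
    )
    parser.add_argument("--overscan-fit-order", type=int, default=1,
                        help=superuser_help)
    return parser
```

`lsst help process-ccd` would then call `build_help_parser()` while `lsst help --all process-ccd` would call `build_help_parser(show_all=True)`.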
Similarly, a user might want to filter the command line help to just the base package or a certain subtask. These commands would help with that:
# only arguments for process-ccd itself
> lsst help --base process-ccd
# help for isr,calibrate subtasks.
> lsst help --sub isr,calibrate process-ccd
Graphical task help
The terminal has limited information bandwidth. Instead, we could use the show verb:
> lsst show process-ccd [args]
This launches a local static web page showing the pipeline, including task help and the values of arguments as currently set on the command line.
This is an improvement on the docs we can ship with the LSST Stack Handbook, because this page will reflect the actual state of a task given the current configuration, including redirected subtasks and which arguments have been set to non-default values.
Graphical task composition
The static web page help provided by lsst show process-ccd was nice, but why settle?
The command
> lsst compose process-ccd
launches a graphical task composer. That is, a local Python server is booted up. In this local web app, the user can actually configure and preview the task pipeline.
The user could graphically redirect a subtask to another one and dynamically see the new options that are needed.
The user could also see exactly what data would be processed given Butler data id selectors.
Once the user was satisfied, that pipeline configuration could be exported from the local web app so that the user could immediately run the pipeline in the command line.
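The export step could be as simple as serializing the composed configuration into a runnable command line. A hypothetical sketch, with the task name and config keys invented for illustration:

```python
import shlex

def export_command(task, config):
    """Turn a pipeline configuration from the composer into a shell command."""
    parts = ["lsst", task]
    for key, value in sorted(config.items()):
        parts.append(f"--{key}={value}")
    # shlex.quote makes the exported line safe to paste into a shell.
    return " ".join(shlex.quote(part) for part in parts)
```

The user copies the exported line from the web app and runs it directly, so the graphical and command line interfaces stay interchangeable.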
Architectural requirements
This discussion is deliberately not about implementation, but rather about experience. Nonetheless, the experience requires these pieces of infrastructure to be implemented:
- There needs to be a task registry that not only LSST Stack tasks plug into, but that any third-party obs_ tasks plug into as well. This will allow the lsst command to show a listing of all commands, and lsst compose to help a user redirect subtasks by showing the tasks available.
- Tasks would no longer exist as command line scripts, but as Python modules that follow a task protocol/API.
- There needs to be an API for tasks to expose their processing task pipeline DAG, as currently configured.
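The pieces above can be sketched together in a few lines. The decorator-based registry, the Task class, and the pipeline_dag() protocol are all hypothetical, chosen only to show how the registry and DAG-exposure requirements might fit:

```python
# Registry populated at import time; obs_ packages would register here too.
TASK_REGISTRY = {}

def register_task(name):
    """Class decorator that adds a task to the global registry."""
    def wrapper(cls):
        TASK_REGISTRY[name] = cls
        return cls
    return wrapper

@register_task("process-ccd")
class ProcessCcdTask:
    # Subtasks named as data, so configuration can redirect them.
    subtasks = ("isr", "calibrate", "detect")

    def pipeline_dag(self):
        """Expose the currently configured pipeline as (parent, child) edges."""
        return list(zip(self.subtasks, self.subtasks[1:]))
```

With a protocol like this, `lsst` can list every registered task, and `lsst show`/`lsst compose` can render the DAG as currently configured rather than as statically documented.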
Closing
I’ve designed a command line task architecture not by considering the implementation details, but by instead considering the user experience. I’ve given a realization of what UX thinking might give you. But even if this specific command line UI is not adopted, I stress that UX thinking should be used when implementing any changes to the command line task architecture.
I also think that tasks should be viewed and designed as a cohesive whole. Tasks shouldn’t just be created to suit a need and figuratively thrown into a bin/ directory. Tasks should serve as a unified vocabulary for processing data.