Dependencies and other issues in using DM code in quasi-real-time processes

In a Confluence discussion on the WCS redesign I asked about dependencies of the proposed code in the light of its possible use in the TCS. In reply, @parejkoj properly noted that I was taking the discussion away from its original intent; respecting that, and because it’s really a larger concern, I’ll try continuing it here.

In that thread, @jbosch wrote:

It’s hard to imagine that the requirements for a lightweight install at the summit could be more restrictive than what a typical level 3 science user would tolerate […]

That’s a different perspective. I’d have said that a Level 3 user would quite possibly be closer to the model of a maximal install, at least as far as algorithmic code is concerned. (It may be different for middleware; e.g., they may not be interested in the heavy-duty workflow framework unless they are setting up their own large-scale productions.)

My past experience has been that trying to keep quasi-real-time and/or high-reliability processes tight is generally a good engineering principle. I wonder if we are just seeing this from very different angles.

By the way, I’m thinking both of code dependencies and, though I didn’t make this clear, of run-time dependencies, which can easily accrete (though need not, with good design) as the transitive closure of the code base grows. More shared libraries ==> more dynamic loads, more traversals of search paths, more system calls, and slower startup of applications. More code included ==> lower probability that the final application author fully understands the environment that is required and the surprising things that may happen under the covers - e.g., unexpected I/O loading configuration and conditions data associated with big frameworks.
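To make the startup-cost point concrete, here is a rough probe of import time in Python. `import_cost` is a hypothetical helper of my own, the module names are arbitrary stand-ins, and the absolute timings will vary by machine; the point is only that the larger the transitive closure of imports, the more time an application pays before it does any real work.

```python
import importlib
import sys
import time


def import_cost(module_name):
    """Time one import of module_name (a rough, hypothetical probe)."""
    # Evict the top-level module so importlib performs the load again.
    # Submodules stay cached, so this understates a truly cold start.
    sys.modules.pop(module_name, None)
    start = time.perf_counter()
    importlib.import_module(module_name)
    return time.perf_counter() - start


# A small stdlib module vs. one that drags in more machinery:
for name in ("json", "email.mime.multipart"):
    print(f"{name}: {import_cost(name) * 1000:.2f} ms")
```

For a real comparison one would time full interpreter startup (`python -c "import X"`) in a fresh process, since in-process probes like this miss shared-library loading and path-search costs.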

This conversation seems worth continuing, as we appear to be heading into a phase of the project in which it is increasingly likely that DM code will appear in a variety of summit processes. (See, for instance, the discussions at the recent Camera workshop, which I’ll write up elsewhere.)

Similar concerns arise, for instance, in the question of whether DM code will be used/usable in the centroiding of the stars observed by the guide sensors, and of how much of the usual application framework can/must be stripped away to allow the actual analysis to run in the few milliseconds available.

It is relevant to our design for packaging and deployment of the DM software whether our vision is that people should simply quasi-monolithically install “the stack” or whether it is possible to create narrow slices as appropriate. This is also relevant if we wish to share elements of the architecture we are developing (e.g., the Butler) with other projects without requiring them to absorb all of DM or even all of afw. It’s relevant to whether there are any parts of our C++ software that are useful without the Python, and vice versa.

One way to address this is to wait until very concrete use cases appear - e.g., the construction of the guider system software - and then tackle the issue as seems necessary. It may just be too complicated and fuzzy to attempt to reason about it without at least a few guiding examples. If so, I feel like we should run through a few such examples soon-ish (certainly by 2017).

@rowen wrote PyGuide based on a fairly sophisticated algorithm from Jim Gunn and Connie Rockosi. It does not depend on any stack software (it was not written for LSST) and has very few other dependencies (basically scipy and numpy). It is quite fast: we replaced Jim and Connie’s hand-tweaked C code with it at the SDSS 2.5m at APO and it is slightly faster. It could serve as a minimal guiding framework to build on. Plus, the primary author is part of the project.

Relatedly, Russell also wrote the new (and old, but that’s not relevant here) TCC for the SDSS telescope. It’s been running at the APO 2.5m for about a year, and on the 3.5m for longer. It’s mostly Python. I don’t know how the TCS requirements for LSST compare, or whether this was considered a possibility.

I think the assumption so far has been that the TCS group was going to write the higher levels of their control loops themselves, largely from scratch. I don’t explicitly know whether @rowen’s previous work is being looked at by the TCS team.

The context of my post was that there is a slowly developing consensus that the non-DM components of LSST should, when reasonable and feasible, use DM stack code in preference to rewriting algorithmic code. This has been said about the centroiding, in particular. So the question is whether the packaging and interfaces of the DM algorithmic code are at present well-suited to “I receive in memory 8 50x50 raw images from the Camera every 1/9 second and I need to find a known source and its centroid for each image (presence of a star and data quality permitting, of course!) in a small and stable number of milliseconds after receipt.”
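For concreteness, the data rate that scenario implies can be written down directly. The numbers below simply restate the figures above; the strictly-serial budget is my own back-of-the-envelope framing, not a statement about how the guider software would actually be structured.

```python
# 8 guide-sensor stamps arrive every 1/9 second.
IMAGES_PER_BATCH = 8
BATCH_PERIOD_S = 1.0 / 9.0

images_per_second = IMAGES_PER_BATCH / BATCH_PERIOD_S
# If one process handled the stamps strictly serially,
# each stamp would get this much wall-clock time:
serial_budget_ms = BATCH_PERIOD_S / IMAGES_PER_BATCH * 1000.0

print(f"{images_per_second:.0f} images/s, ~{serial_budget_ms:.1f} ms per image")
# -> 72 images/s, ~13.9 ms per image
```

So "a small and stable number of milliseconds" is not hyperbole: even a perfectly parallelized system has only ~14 ms of serial-equivalent budget per stamp.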

It’s easy to imagine, for instance, that we could readily handle the throughput but not the latency - e.g., because of some startup overhead in the processing of each image that is acceptable in a system where we expect to find large numbers of sources on every image, so that the per-image overhead is negligible by comparison.
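A toy model of that failure mode, with entirely made-up numbers (both constants below are hypothetical, chosen only to illustrate how the same fixed cost can be invisible in one regime and dominant in the other):

```python
# Hypothetical fixed per-image cost (framework startup, config I/O, ...).
OVERHEAD_MS = 20.0
# Hypothetical marginal cost per detected source.
PER_SOURCE_MS = 0.05


def processing_ms(n_sources):
    """Total time to process one image under this toy model."""
    return OVERHEAD_MS + PER_SOURCE_MS * n_sources


# Survey-style image, tens of thousands of sources: the fixed
# overhead is well under 1% of the total, so throughput looks fine.
survey = processing_ms(50_000)   # 2520.0 ms; overhead ~0.8%
# Guider stamp, one star: the same overhead *is* the latency.
guider = processing_ms(1)        # 20.05 ms; overhead ~99.8%
print(survey, guider)
```

Under these assumptions, code tuned and benchmarked on survey images would never surface the overhead that makes it unusable for the guider case.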

I don’t know whether the DM stack is suited to that particular task (centroiding guider images from objects in memory in a short amount of time).

I am pretty sure that PyGuide can do that (find centroids and give some measured star properties) with just a few lines of Python, and likely at the required speed (my profiling test on 15 SDSS guide fibers gives ~80 ms total for finding, centroiding, and determining shapes). This assumes that it really can just be given a numpy array of the desired pixels (e.g., that Python doesn’t need to be started from scratch). I understand that you may want some other things that the stack can provide, but in this particular case they may not be necessary, and the code already exists and is readily usable.

And since an example may be helpful, here’s a basic “guide step”, finding and centroiding the stars in an image (a numpy array). I realize this isn’t answering the question you’re asking, but it’s an already-developed algorithm that may be fit for purpose.

import PyGuide

# describe the CCD: only need to do this once
ccdInfo = PyGuide.CCDInfo(imageBias, readNoise, ccdGain)
# find all stars brighter than some threshold (brightest first)
stars = PyGuide.findStars(image, mask, saturatedMask, ccdInfo, thresh=5)
# the shape of the brightest star in the image
shape = PyGuide.StarShape.starShape(image, mask, stars[0][0].xyCtr, 100)