I’ve just merged DM-25919, which both renames and modifies some of the Gen3 butler query methods that many of you are already familiar with.
First, the bad news (breaking changes):
- Registry.queryDimensions has been renamed to queryDataIds.
- The expand argument has been removed from queryDimensions/queryDataIds and queryDatasets.
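For anyone updating call sites, here is a minimal sketch of what the migration might look like. The helper name `visit_detector_data_ids`, its `need_records` flag, and the choice of the `visit` and `detector` dimensions are hypothetical, picked only for illustration; the sketch also assumes data ID values are passed as keyword arguments. The `expanded` method is the replacement for `expand=True` that this post describes on the new result objects.

```python
def visit_detector_data_ids(registry, instrument, need_records=False):
    """Query visit+detector data IDs using the renamed method.

    Hypothetical helper for illustration; `registry` is assumed to be a
    Gen3 butler Registry (or anything with the same queryDataIds method).
    """
    # Before DM-25919 this call would have been written as:
    #     registry.queryDimensions(["visit", "detector"],
    #                              instrument=instrument,
    #                              expand=need_records)
    results = registry.queryDataIds(["visit", "detector"], instrument=instrument)
    if need_records:
        # Opt in to expanded data IDs explicitly; they are no longer the default.
        results = results.expanded()
    return results
```

Code that relied on the old default (`expand=True`) without saying so is the case most likely to need attention, since the records are no longer attached unless `expanded()` is called.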
And now the good news:
- queryDataIds and queryDatasets are now faster by many orders of magnitude for large queries, at least by default (performance is similar to the old speed with expand=False, but expand=True was the default).
- There is a new method, queryDimensionRecords, which returns metadata rows for a dimension directly, and is hence a much more convenient interface for that purpose (compared to the old approach of querying for data IDs and then accessing .records on those).
- queryDataIds and queryDatasets now return custom iterator objects (DataCoordinateQueryResults and DatasetQueryResults) with many extra methods, most of which return new result objects (it’s a “method chaining” interface, for those of you familiar with that concept). Those include an expanded method that replaces the old expand=True keyword argument (but without the enormous performance penalty), a findDatasets method to do bulk searches for datasets whose data IDs were identified by the original query, and a materialize context manager that stores the results in a temporary table in the database, allowing follow-up related queries without having to nest (and hence possibly re-execute) the original query as a subquery or round-trip the results through Python objects. These result objects are all still lazy iterators that don’t execute the query until iteration begins; we don’t want to assume users always want to fetch all results and stuff them in a container, even if that’s often the case. They do have toSet and toSequence methods that make fetching into Python containers easy when desired.
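To make the "lazy iterator plus method chaining" idea concrete, here is a toy model of the pattern. It is not the butler implementation: the `LazyResults` class, the `fake_query` function, and the row contents are all hypothetical, and a plain Python list stands in for the temporary database table. Only the method names (`expanded`, `materialize`, `toSet`, `toSequence`) are borrowed from the description above.

```python
from contextlib import contextmanager


class LazyResults:
    """Toy stand-in for a lazy, chainable query-result object."""

    def __init__(self, run_query):
        # Zero-argument callable that "executes the query" and returns rows.
        self._run_query = run_query

    def __iter__(self):
        # The query only runs here, when iteration actually begins.
        return iter(self._run_query())

    def expanded(self):
        # Returns a new result object; nothing is executed yet.
        return LazyResults(lambda: (row + ("expanded",) for row in self))

    @contextmanager
    def materialize(self):
        rows = list(self)  # execute once and store the rows
        try:
            # Follow-up iteration reads the stored rows, not the original query.
            yield LazyResults(lambda: iter(rows))
        finally:
            rows.clear()  # stand-in for dropping the temporary table

    def toSet(self):
        return set(self)

    def toSequence(self):
        return list(self)


query_runs = []  # records each time the fake query is executed


def fake_query():
    query_runs.append("executed")
    return [("HSC", 903334, 16), ("HSC", 903334, 22)]


results = LazyResults(fake_query).expanded()
runs_before_iteration = len(query_runs)  # chaining alone runs nothing

with results.materialize() as stored:
    as_set = stored.toSet()
    as_sequence = stored.toSequence()
runs_after = len(query_runs)  # fetched twice, but the query ran once
```

Without the `materialize` block, calling `toSet` and then `toSequence` on `results` would run `fake_query` twice, which is the re-execution cost the real context manager is meant to avoid.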
As documented on DM-24938, these changes make the parts of QuantumGraph generation that they were intended to optimize dramatically faster, but they make what is actually the bottleneck slightly slower, so there’s little overall change in performance. But they also set the stage for optimizing that bottleneck in the same way (on DM-24432, my current project), so I’m optimistic that we’ll soon get QuantumGraph generation from approximately an hour (per tract, on HSC) down to 10-15 minutes.
I’ll add API doc links to the text above once the weekly docs are built. User guide docs for this functionality are not yet written; there’s some more functionality I’d like to add first.