Pybind11 merged

DM-8467 has been merged to master. This is a comprehensive change affecting almost all packages in lsst_distrib. From now on, all C++ to Python wrappers are generated with Pybind11 instead of Swig. A new DM Pybind11 Style Guide and a new step-by-step Pybind11 wrapping tutorial are now available and should be followed when writing wrappers.

How does this affect me?

  • If you are using, or developing against, a stable release you won’t be affected at all. The latest release, 13.0, is still based on Swig.
  • If you are developing against the latest weekly, you may continue to use the Swig codebase by using w_2017_10 until you are ready to migrate. Alternatively you can use the 13.0 release instead.
  • If you are developing in Python against master you might see small API changes (see the change log below), but in general should not notice a major difference.
    The current Pybind11 switch attempts to minimize API changes from Swig. But Pybind11 enables more Pythonic APIs for wrapped C++ classes, so you may get nicer interfaces over time.
  • If you are developing in C++ and your code needs to be called from Python, you will have to add Pybind11 wrappers. See the aforementioned documentation.

Change log

While the API changes from Swig have been kept to a minimum, some changes were necessary. Please take note of the following.

All integers are long
All integers returned from C++ routines (e.g. int, unsigned int, long) are long in Python 2 and int in Python 3 (note that this follows Python 3 convention).
Therefore, adapt isinstance tests to check for both int and long.
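A minimal sketch of such a check that runs unchanged on Python 2 and 3 is shown here. The helper name is_integer is hypothetical, and numbers.Integral is offered as an alternative to listing int and long explicitly (note that it also accepts bool).

```python
import numbers


def is_integer(value):
    """Return True for int (Python 2 and 3) and long (Python 2).

    Hypothetical helper, not part of the LSST stack. numbers.Integral is
    the abstract base class that int and long are both registered with,
    so the name `long` never has to appear in the code.
    """
    return isinstance(value, numbers.Integral)


# Explicit alternative that follows the advice above literally.
try:
    integer_types = (int, long)  # Python 2: both types exist
except NameError:
    integer_types = (int,)       # Python 3: long no longer exists
```

Either form can then be used as isinstance(value, integer_types) or is_integer(value) wherever the old code tested for int alone.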

Most strings are unicode
Most strings returned from C++ routines are unicode in Python 2 and str in Python 3 (note that this also follows Python 3 convention).
Therefore, adapt isinstance tests to check for basestring (add from past.builtins import basestring to make this work with Python 3).
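As a rough sketch, the check could look like the following. The helper name is_string is hypothetical, and the final fallback to str is an assumption for a Python 3 environment where the future package (which provides past.builtins) is not installed.

```python
try:
    basestring  # builtin on Python 2
except NameError:
    try:
        # Python 3 with the "future" package installed
        from past.builtins import basestring
    except ImportError:
        # Assumption: plain Python 3 without "future"; str is the text type
        basestring = str


def is_string(value):
    """Hypothetical helper: True for text returned from wrapped C++ code."""
    return isinstance(value, basestring)
```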

Enums are distinct types
Swig wrapped enums as plain integers, with underscores to indicate scoping (e.g. afw.image.RotType_SKY). With Pybind11 enums are distinct types and are scoped as members (e.g. afw.image.RotType.SKY).
The new behavior is safer: it doesn’t allow you to accidentally call a function that takes an integer and an enum argument with the arguments in the wrong order.
However, in some cases it is now necessary to explicitly cast the enum to an integer. We are working to identify such cases and change the API accordingly.
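The effect can be illustrated without the stack by a pure-Python stand-in. Everything below is hypothetical: the RotType class only mimics how a Pybind11 enum is scoped and converted, the member values are arbitrary, and rotate is an invented function that takes an integer and an enum.

```python
class RotType(object):
    """Pure-Python stand-in for a Pybind11-wrapped enum (illustration only)."""

    def __init__(self, name, value):
        self.name = name
        self._value = value

    def __int__(self):
        # An explicit int() cast recovers the underlying integer.
        return self._value

    def __repr__(self):
        return "RotType.%s" % self.name


# Members are attributes of the type, as in afw.image.RotType.SKY.
# The numeric values here are arbitrary.
RotType.SKY = RotType("SKY", 0)
RotType.HORIZON = RotType("HORIZON", 1)


def rotate(angle_index, rot_type):
    """Hypothetical function taking an integer and an enum."""
    if not isinstance(rot_type, RotType):
        raise TypeError("rot_type must be a RotType, got %r" % (rot_type,))
    return angle_index, int(rot_type)


ok = rotate(3, RotType.SKY)

try:
    rotate(RotType.SKY, 3)  # arguments swapped
    swapped_rejected = False
except TypeError:
    swapped_rejected = True
```

With Swig both arguments were plain integers, so the swapped call would have been accepted silently.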

STL containers are copied into Python equivalents
In Pybind11 all functions that return STL container types now return the equivalent Python containers as a copy:

  • std::pair and std::tuple return tuple;
  • std::list and std::vector return list;
  • std::map and std::unordered_map return dict;
  • std::set and std::unordered_set return set.

This means all special Swig vector and list types are gone, replaced by native Python types. For example, you must replace input = afwImage.vectorMaskedImageF() with input = [].

Moreover all functions that accept std::pair, std::tuple, std::list or std::vector arguments now accept any Python iterable.

Because a copy is returned you cannot change the value and have it propagate back to C++ code. Fortunately, this behaviour was used very little.

In fact, the only case encountered was lsst.shapelet.MultiShapeletFunction.getComponents().push_back(item), which was replaced by lsst.shapelet.MultiShapeletFunction.addComponent(item).
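The copy semantics can be sketched with a pure-Python stand-in. The class below is not the real lsst.shapelet class; it mimics only the fact that the getter returns a fresh list while the add method changes the object itself.

```python
class MultiShapeletFunction(object):
    """Pure-Python stand-in; mimics only the copy-on-return behaviour."""

    def __init__(self):
        self._components = []

    def getComponents(self):
        # A new list on every call, like a std::vector converted to a list.
        return list(self._components)

    def addComponent(self, item):
        self._components.append(item)


msf = MultiShapeletFunction()
msf.getComponents().append("lost")  # changes only the temporary copy
msf.addComponent("kept")            # changes the object itself
components = msf.getComponents()
```

After this runs, components holds only "kept": the item appended to the returned copy never reached the object.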

Where this copying causes performance problems, the C++ should be modified to take an ndarray instead.

Functions use keyword arguments
Most functions now accept keyword arguments.

Type casts are no longer needed for C++ return types in Python
Unlike Swig, Pybind11 always returns the most derived type. Thus explicit type casts (typically done with the .cast() method) are no longer available in Python and should be removed from your code.

Always use true division
Wrapped C++ types now only implement __truediv__. Thus always use from __future__ import division.
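To see why the import matters, consider a pure-Python stand-in that, like the wrapped types, defines only __truediv__. The Extent class here is hypothetical and merely illustrates the operator lookup. The snippet is written for Python 3; on Python 2 it needs the __future__ import described in the comment.

```python
# On Python 2, the first statement of the module must be:
#     from __future__ import division
# Without it, `/` looks for __div__, which the stand-in below (like the
# wrapped types) does not define, and the division raises TypeError.


class Extent(object):
    """Pure-Python stand-in for a wrapped C++ type (illustration only)."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __truediv__(self, scalar):
        return Extent(self.x / float(scalar), self.y / float(scalar))


half = Extent(3.0, 4.0) / 2
has_classic_div = hasattr(Extent, "__div__")
```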

Always use numpy types for afw.schema
Use np.int32 or np.int64 to specify the type of a schema field. This was already needed for Python 3, but is now even more often needed.

All moduleLib files have been removed
These files represented incomplete Swig-wrapped modules and were never intended to be used from outside the package (but they may have been). If you encounter any in your code, please remove them and use the package name instead (e.g. lsst.afw.geomLib should be lsst.afw.geom).

Some exception types have changed
Some methods that previously raised a particular exception type now raise a closely related exception type, because the exception is now being thrown by C++ code instead of Python code (or vice versa). For example, some code that previously raised KeyError now raises pex.exceptions.NotFoundError, which inherits from LookupError (the base class of KeyError) rather than from KeyError itself.
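One way to make calling code robust against this particular change is to catch the common base class. The sketch below uses a stand-in NotFoundError defined locally (an assumption standing in for pex.exceptions.NotFoundError) and three hypothetical helper functions.

```python
class NotFoundError(LookupError):
    """Stand-in for pex.exceptions.NotFoundError (illustration only)."""


def lookup_python(mapping, key):
    """Hypothetical old-style lookup: raises KeyError on a missing key."""
    return mapping[key]


def lookup_wrapped(mapping, key):
    """Hypothetical new-style lookup: raises NotFoundError instead."""
    if key not in mapping:
        raise NotFoundError("key %r not found" % (key,))
    return mapping[key]


def get_or_default(lookup, mapping, key, default=None):
    try:
        return lookup(mapping, key)
    except LookupError:  # catches KeyError and NotFoundError alike
        return default
```

An except KeyError clause would let the NotFoundError from lookup_wrapped propagate, which is the kind of breakage to look for when migrating.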

Additional things to keep in mind

  • Butler types with lazy loading need to be explicitly registered with lsst::daf::persistence::python::register_proxy. We are currently compiling a list of types for which this is needed. In the meantime lazy loading is temporarily turned off by default.
  • Custom C++ exceptions that need to be translated to Python should be registered with lsst::pex::exceptions::declareException.

Credits (a.k.a. find your local expert)

The switch to Pybind11 was the result of a collaborative effort by many people.

  • Russell Owen
  • Krzysztof Findeisen
  • Jim Bosch
  • Fred Moolekamp
  • Serge Monkewitz
  • Pim Schellart

In addition, thanks go to the upstream Pybind11 developers, who were very helpful in implementing new functionality and fixing bugs for us.

@kfindeisen kindly clarifies that w.2017.11 was the first weekly after the Pybind11 changes were merged into master.