New Conda Environment
This post is to notify the Science Pipelines community of the implementation of a new conda environment from RFC-679. After some final checks tonight and tomorrow with ci_hsc, we intend to roll this out tomorrow, April 30th.
The essence of this change is a new conda environment, a new compiler, and an upgrade of our third parties. Details of those changes are below.
Orchestrating a change like this isn’t easy, and while I’ve tried to account for all the details, I may have missed one. If you see any issues after the change, please notify me and create a ticket (please use conda as a component if possible).
Switching channels to conda-forge
By default, we have switched to conda-forge to supply our third party Python dependencies, rather than the conda distribution (“defaults”).
conda-forge is a “community-led collection of recipes, build infrastructure and distributions for the conda package manager”. Fundamentally, it uses conda-build and its metadata to build package recipes, provides a process for contributing those recipes, and leverages common CI infrastructure to build and rebuild them for multiple platforms, providing compatible binaries for all sorts of packages.
Using conda-forge for (most) third party libraries and compilers.
Significantly, most third party dependencies, including Python and non-Python (C++), come from conda-forge instead of eups now. Most notably, we are now on cfitsio 3.47, and we have merged some code to help with that transition a bit. See DM-24376. This isn’t perfect, but we aim to improve this experience while working within the FITS standard.
We have contributed, and continue to contribute, recipes for third parties to the conda-forge community to support our software. Our general approach going forward is to push third party or generally useful software to conda-forge when necessary. For developing recipes to achieve this, there is extensive documentation on the conda-forge site on how to add or maintain a recipe, and there are lots of helpful people in #dm-conda on Slack.
Using the compilers provided from conda-forge
To support this change to the conda-forge channel, especially on CentOS 7, we needed to switch to conda compilers to ensure compatibility with the provided packages, specifically those maintained by the conda-forge community. We are using the default “comp7” set of compilers, which are based on GCC 7.3.0 on Linux and clang 9.0.1 on macOS.
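To confirm which compiler a shell will actually pick up, a rough check like the one below may help. It is only a sketch: it assumes the conda compiler packages export CC when the environment is activated and that CONDA_PREFIX points at the active environment. The helper name compiler_origin is made up for this illustration.

```shell
#!/usr/bin/env bash
# Classify a compiler path as conda-provided or system-provided.
# $1: absolute path to the compiler, $2: conda environment prefix.
# (compiler_origin is a hypothetical helper, not part of any LSST tool.)
compiler_origin() {
    case "$1" in
        "$2"/*) echo conda ;;
        *) echo system ;;
    esac
}

# Assumption: CC is set by the conda compiler activation scripts;
# fall back to plain "cc" when it is not set.
cc_path=$(command -v "${CC:-cc}" || true)
if [ -n "$cc_path" ] && [ -n "${CONDA_PREFIX:-}" ]; then
    echo "C compiler: $cc_path ($(compiler_origin "$cc_path" "$CONDA_PREFIX"))"
    "$cc_path" --version | head -n 1
else
    echo "No compiler found or no conda environment is active."
fi
```

If this reports "system" inside an activated stack environment, the conda compilers are probably not the ones being used for the build.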
System Requirements, Installing and Building
We are working to reduce the differences between newinstall and lsstsw in addition to simplifying newinstall. Much of this work will occur after the change.
On most Linux distributions, you should only need git, patch, curl, and make installed to get up and running with lsstsw or newinstall.
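A quick way to check for those four tools before starting is sketched below. The function name missing_tools is hypothetical and not part of lsstsw or newinstall; it simply reports any command that is not on your PATH.

```shell
#!/usr/bin/env bash
# Print the name of every given command that is not found on PATH.
# (missing_tools is a hypothetical helper for this example.)
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# The four prerequisites named above.
missing=$(missing_tools git patch curl make)
if [ -n "$missing" ]; then
    echo "Missing prerequisites:" $missing
else
    echo "All prerequisites found."
fi
```

Anything it lists can be installed with your distribution's package manager before running lsstsw or newinstall.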
newinstall.sh defaults to conda compilers
By default, newinstall.sh in master will assume you will want the conda-system compiler. This is a generic term which denotes that conda is providing the compilers. There is a flag (-g) which can revert to the old behavior for compatibility.
lsstsw
Build manifests (EUPS version lists) now include the conda environment (repo and SHA1) that was activated when the build occurred. This links a published EUPS tag with the conda environment.
Additional Notes
Jenkins
Most notably, Jenkins is dropping centos6 and adding centos8 for builds. We are keeping centos7 as the baseline at present.
With this change, the packer-layercake method of building Docker containers will be deprecated in favor of Dockerfiles in the lsst-dm/docker-scipipe repo. The Dockerfiles in the base-7 and base-8 directories in that repo will correspond with lsstdm/scipipe-base:7 and lsstdm/scipipe-base:8 respectively, as they currently are in docker hub. docker-newinstall will also be updated to be based off of the lsstdm/scipipe-base:7 image. The infra monthly jobs in jenkins-dm-jobs, which build some containers, will be modified soon after the changes occur.
Modifying a deployed conda environment
We don’t currently encourage users to modify their environment or install extra software, but it is inevitable some users may need to.
Currently, we do not change the conda environment configuration (.condarc) to add conda-forge to the channels, so if you install additional software on top of your environment, you may want to add conda-forge to your environment’s channels. You can do so from your activated environment by running the following command:
conda config --env --add channels conda-forge
This is especially true as we also do not pin the dependencies in an environment once installed. So if you attempt to conda install you may get packages updated from the defaults channel as opposed to the conda-forge channel. Pinning can be useful to prevent your dependencies from changing should you wish to modify your environment:
conda list > $CONDA_PREFIX/conda-meta/pinned
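Plain conda list output includes "#" header lines as well as build and channel columns. If your conda version does not accept those lines as pin specifications, one possible alternative is to reduce the listing to "name ==version" lines first. The sketch below assumes the usual four-column conda list layout; to_pin_specs is a hypothetical helper name, not a conda command.

```shell
#!/usr/bin/env bash
# Convert `conda list` output (read from stdin) into pin specs:
# skip the "#" header lines and keep only the name and exact version.
# (to_pin_specs is a hypothetical helper for this example.)
to_pin_specs() {
    awk '!/^#/ && NF >= 2 { print $1 " ==" $2 }'
}

# Possible usage inside an activated environment:
#   conda list | to_pin_specs > "$CONDA_PREFIX/conda-meta/pinned"
```

Packages installed with pip also appear in conda list, so you may want to review the resulting file and remove any entries that conda itself does not manage.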
Adding new third parties
After an RFC is adopted, the process of adding a new third party is simplified. It usually involves only adding it to the conda bleed file and getting a maintainer of the conda environment repo to regenerate the package’s files, although it’s possible to generate those files yourself.
If the third party library is to be compiled against and needs a sconsUtils config, you must now create the config in the configs directory of sconsUtils instead of putting such a file in the ups directory of the third party you would have added. See the configs directory in sconsUtils for some examples.
Modifying conda-bleed files
Conda bleed files have been slightly modified to have some structure similar to conda’s meta.yaml requirements sections.
build/host/run sections have been added with some packages moved into those sections. This may help provide some guidance to downstream conda packaging of the stack (e.g. stackvana or others).
The run section is not currently populated. It is intended for additional software which is not required to build, test, or use the stack, but which is required to provide a useful deployment environment, such as on the lsst-dev “shared-stack”.
Added packages and version changes
For the conda environment, we have had to make a few compatibility pins to get things out the door.
These include:
- boost = 1.70.0
- pybind11 < 2.3
- treecorr < 4
- pyqt < 5.12 (Linux only)
  - PyQT is required by matplotlib; this is to avoid a bug in the pyqt recipe which reports additional packages in conda env export
There are a few notable exceptions where we still rely on software in eups which is also available in conda-forge:
- eigen: jointcal (and packages depending on jointcal) will still set up the eups eigen package.
- psfex: Our version of psfex has changed substantially from the version available from astromatic in conda-forge (astromatic-psfex) and they are not equivalent. Our fork has had autotools/Makefile changes from the upstream project backported to improve the builds with conda-forge.
This commit shows which packages have moved into conda-forge. These changes are summarized below.
- apr 1.5.2→1.6.5
- autograd 1.1→1.3
- boost 1.69→1.70.0
- cfitsio 3.360→3.470
- eigen (See Note [1]) 3.3.9→3.3.9
- esutil 0.63→0.64
- galsim 2.2.1→2.2.3
- gsl 2.6→2.6
- healpy 1.10.3→1.13.0
- libaprutil (apr_util) 1.5.4→1.6.1
- lapack (new; required to build psfex)
- lmfit 0.9.3→1.0.0
- log4cxx 0.10.0→0.10.0
- lsstdesc.coord (coord) 1.1→1.2.1
- minuit2 5.34→6.18.00
- mpi4py 3.0.0→3.0.3
- mpich 3.2.1→3.3.2
- ndarray 1.5.3→1.5.3
- pybind11 2.2.4→2.2.4
- starlink-ast 1.3.8→9.1.0
- treecorr 3.2.3→3.3.11
- wcslib 5.13→7.2
- ws4py 0.4.2→0.5.1
- xpa 2.1.15→2.1.20
Notes:
[1] eigen from conda is used if jointcal is not set up. If jointcal is set up, eigen from eups will supersede conda-forge eigen.