Most conda channels include copies of the mpi4py module to satisfy dependencies of MPI-parallelized packages. However, the mpi4py Python code must be built on top of a native MPI library (such as MPICH, Open MPI, or Intel MPI). As a result, the conda packages always include a bundled binary MPI library that was built to generic specifications, often without support for InfiniBand communications or Slurm/Grid Engine integration. For proper functioning, we recommend that mpi4py always be built on top of one of the MPI libraries IT-RCI provides on a cluster.
In this example we will build the virtual environment on Farber using the openmpi/4.0.5
version of Open MPI and Anaconda for the virtual environment:
$ vpkg_require openmpi/4.0.5 anaconda/5.2.0:python3
Adding dependency `ucx/1.9.0` to your environment
Adding package `openmpi/4.0.5` to your environment
Adding package `anaconda/5.2.0:python3` to your environment
Due to recent announcements regarding Anaconda, and Intel dropping its distribution channel, any documentation that refers to Intel's channel will need to be updated. Please use the conda-forge channel for installations.
We will be creating a Python virtual environment containing the NumPy and SciPy libraries, into which mpi4py will be added. In case we need to create additional similar environments in the future, we will set up a directory hierarchy that allows multiple versions to coexist:
$ mkdir -p ${HOME}/conda-envs/my-sci-app/20201102
Two things to note:
- ${HOME} could be replaced by ${WORKDIR}/users/myname, for example, to create the hierarchy elsewhere.
- Naming each version in YYYYMMDD format promotes simple sorting of the versions from oldest to newest.
The directory structure lends my-sci-app to straightforward management using VALET.
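The naming scheme can be illustrated with a short Python sketch. The helper name env_dir and the second build date are hypothetical, chosen only to show that YYYYMMDD directory names sort chronologically as plain strings; the base path is the example home directory used in the transcripts on this page.

```python
from datetime import date
from pathlib import Path


def env_dir(base: Path, app: str, built: date) -> Path:
    """Return <base>/conda-envs/<app>/<YYYYMMDD> for the given build date."""
    return base / "conda-envs" / app / built.strftime("%Y%m%d")


# Example home directory from this page; substitute your own base path.
base = Path("/home/1001")

# Hypothetical build dates, deliberately listed out of order.
versions = [env_dir(base, "my-sci-app", d).name
            for d in (date(2020, 11, 14), date(2020, 11, 2))]

# Plain string sorting puts the oldest version first.
print(sorted(versions))
```

Because the year comes first and each field is zero-padded, no date parsing is needed to order the versions.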
The virtual environment is first populated with all packages that do not require mpi4py. Any packages requiring mpi4py must be installed after we build and install our local copy of mpi4py in the virtual environment. In this example, neither NumPy nor SciPy requires mpi4py.
The two channel options ensure that only the default Anaconda channels are consulted. Otherwise the command could still pick packages from another configured channel (the Intel channel, for example), which would still have the binary compatibility issues.
$ conda create --prefix ${HOME}/conda-envs/my-sci-app/20201102 --channel defaults --override-channels python'>=3.7' numpy scipy
Solving environment: done
   :
Proceed ([y]/n)? y
   :
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
#
# To activate this environment, use:
# > source activate /home/1001/conda-envs/my-sci-app/20201102
#
# To deactivate an active environment, use:
# > source deactivate
#
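If the same command has to be repeated for several environments, it can help to assemble it programmatically. The following is a minimal sketch, not part of the original procedure: the function name conda_create_command is hypothetical, and it only builds and prints the argument list rather than running conda.

```python
import shlex


def conda_create_command(prefix, packages, channel="defaults"):
    """Build the argument list for a channel-restricted `conda create`."""
    return [
        "conda", "create",
        "--prefix", prefix,
        "--channel", channel,
        "--override-channels",  # consult only the channel named above
        *packages,
    ]


cmd = conda_create_command(
    "/home/1001/conda-envs/my-sci-app/20201102",
    ["python>=3.7", "numpy", "scipy"],
)

# shlex.join quotes arguments as needed so the line can be pasted into a shell.
print(shlex.join(cmd))
```

Passing channel="conda-forge" would produce the equivalent command for the conda-forge channel.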
Before building and installing mpi4py the environment needs to be activated:
$ source activate /home/1001/conda-envs/my-sci-app/20201102
(/home/1001/conda-envs/my-sci-app/20201102)$
With the new virtual environment activated, we can now build mpi4py against the local Open MPI library we added to the shell environment.
(/home/1001/conda-envs/my-sci-app/20201102)$ pip install --no-binary :all: --compile mpi4py
Collecting mpi4py
  Using cached mpi4py-3.0.3.tar.gz (1.4 MB)
Skipping wheel build for mpi4py, due to binaries being disabled for it.
Installing collected packages: mpi4py
    Running setup.py install for mpi4py ... done
Successfully installed mpi4py-3.0.3
The --no-binary :all: flag prohibits the installation of prebuilt binary packages, effectively forcing a rebuild of mpi4py from source. The --compile flag compiles all Python scripts in the mpi4py package to bytecode at install time (versus allowing them to be compiled and cached later). The environment now includes support for mpi4py linked against the openmpi/4.0.5 library on Farber:
(/home/1001/conda-envs/my-sci-app/20201102)$ pip list | grep mpi4py
mpi4py 3.0.3
Additional packages that require mpi4py can now be installed into the environment.
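The pip listing confirms that mpi4py is installed but not which MPI library it was linked against. A small check along the following lines can report that; it is a sketch that assumes it is run with the environment's own python3, and the function name describe_mpi is hypothetical. The import is guarded so the script still runs (and says so) where mpi4py is absent.

```python
def describe_mpi():
    """Return a short report on the MPI library behind mpi4py, or None if absent."""
    try:
        import mpi4py
        from mpi4py import MPI
    except ImportError:
        return None
    major, minor = MPI.Get_version()
    return {
        "mpi4py": mpi4py.__version__,
        "mpi_standard": f"{major}.{minor}",
        # Vendor/version string of the MPI library found at run time.
        "library": MPI.Get_library_version().strip(),
    }


report = describe_mpi()
print(report if report is not None else "mpi4py is not installed in this interpreter")
```

On a correctly built environment the library string should name the cluster's Open MPI release rather than a bundled generic MPI.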
The new virtual environment can easily be added to your login shell and job runtime environments using VALET. First, ensure you have your personal VALET package definition directory present:
$ mkdir -p ${HOME}/.valet
$ echo ${HOME}/conda-envs/my-sci-app
/home/1001/conda-envs/my-sci-app
Take note of the path echoed, then create a new file named ${HOME}/.valet/my-sci-app.vpkg_json
and add the following text to it:
{
    "my-sci-app": {
        "prefix": "/home/1001/conda-envs/my-sci-app",
        "description": "Some scientific app project in Python",
        "standard-paths": false,
        "actions": [
            {
                "action": "source",
                "order": "failure-first",
                "success": 0,
                "script": {
                    "sh": "anaconda-activate.sh"
                }
            }
        ],
        "versions": {
            "20201102": {
                "description": "environment built Nov 2, 2020",
                "dependencies": [
                    "openmpi/4.0.5",
                    "anaconda/5.2.0:python3"
                ]
            }
        }
    }
}
Please note:
- The prefix path will be different for you.
- Each version ID names the directory under prefix containing that version.
- If you build against different Open MPI or Anaconda versions, alter the dependencies list accordingly.
- Additional versions are added to the versions dictionary:

"versions": {
    "20201102": {
        "description": "environment built Nov 2, 2020",
        "dependencies": [
            "openmpi/4.0.5",
            "anaconda/5.2.0:python3"
        ]
    },
    "20201114": {
        "description": "environment built Nov 14, 2020",
        "dependencies": [
            "openmpi/3.1.6",
            "anaconda/5.2.0:python3"
        ]
    }
}
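Adding a version by hand is easy to get wrong (a missing comma is enough to break the file), so the edit can also be scripted. The sketch below works on a trimmed copy of the package definition held in memory; the helper add_version is hypothetical, and reading or writing the actual .vpkg_json file is left out.

```python
import json

# Trimmed copy of the package definition (description and actions omitted).
package = {
    "my-sci-app": {
        "prefix": "/home/1001/conda-envs/my-sci-app",
        "versions": {
            "20201102": {
                "description": "environment built Nov 2, 2020",
                "dependencies": ["openmpi/4.0.5", "anaconda/5.2.0:python3"],
            }
        },
    }
}


def add_version(pkg, name, version_id, description, dependencies):
    """Add (or overwrite) one entry in the package's versions dictionary."""
    pkg[name]["versions"][version_id] = {
        "description": description,
        "dependencies": list(dependencies),
    }
    return pkg


add_version(package, "my-sci-app", "20201114", "environment built Nov 14, 2020",
            ["openmpi/3.1.6", "anaconda/5.2.0:python3"])

# json.dumps always emits syntactically valid JSON.
print(json.dumps(package, indent=4))
```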
The versions of the virtual environment declared in the VALET package are listed using the vpkg_versions
command:
$ vpkg_versions my-sci-app

Available versions in package (* = default version):

[/home/1001/.valet/my-sci-app.vpkg_json]
my-sci-app    Some scientific app project in Python
* 20201102    environment built Nov 2, 2020
Activating the virtual environment is accomplished using the vpkg_require
command (in your login shell or inside job scripts):
$ vpkg_require my-sci-app/20201102
Adding dependency `ucx/1.9.0` to your environment
Adding dependency `openmpi/4.0.5` to your environment
Adding dependency `anaconda/5.2.0:python3` to your environment
Adding package `my-sci-app/20201102` to your environment
(/home/1001/conda-envs/my-sci-app/20201102)$ which python3
~/conda-envs/my-sci-app/20201102/bin/python3
(/home/1001/conda-envs/my-sci-app/20201102)$ pip list | grep mpi4py
mpi4py 3.0.3
$ which mpirun
/opt/shared/openmpi/4.0.5/bin/mpirun
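Inside a job script the same sanity checks can be made from Python rather than by eye. This is only a sketch: the function name tool_locations is hypothetical, and it reports what is found on PATH without judging whether the locations are the expected ones.

```python
import shutil
import sys


def tool_locations(tools=("python3", "pip", "mpirun")):
    """Map each tool name to the path found on PATH (None if not found)."""
    return {name: shutil.which(name) for name in tools}


locations = tool_locations()
for name, path in locations.items():
    print(f"{name:8s} {path}")

# sys.prefix is the root of the environment the running interpreter belongs to.
print(f"{'prefix':8s} {sys.prefix}")
```

After vpkg_require, python3 and pip should resolve inside the virtual environment and mpirun inside the cluster's Open MPI installation.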
The steps for completing this work on Caviness are similar to those presented for Farber, beginning with the same directory hierarchy. Here we use Open MPI 4.1.4 (built with GCC 12.1.0) and Anaconda 2024.02:
$ vpkg_require openmpi/4.1.4:gcc-12.1.0 anaconda/2024.02
Adding dependency `libfabric/1.13.2` to your environment
Adding dependency `binutils/2.35` to your environment
Adding dependency `gcc/12.1.0` to your environment
Adding package `openmpi/4.1.4:gcc-12.1.0` to your environment
Adding package `anaconda/2024.02` to your environment
The virtual environment is first populated with all packages that do not require mpi4py. Any packages requiring mpi4py must be installed after we build and install our local copy of mpi4py in the virtual environment. In this example, neither NumPy nor SciPy requires mpi4py.
$ conda create --prefix ${HOME}/conda-envs/my-sci-app/20201102 --channel defaults --override-channels python'>=3.7' numpy scipy
Collecting package metadata (current_repodata.json): done
Solving environment: done
   :
Proceed ([y]/n)? y
   :
#
# To activate this environment, use
#
#     $ conda activate /home/1001/conda-envs/my-sci-app/20201102
#
# To deactivate an active environment, use
#
#     $ conda deactivate
Before building and installing mpi4py the environment needs to be activated:
$ conda activate /home/1001/conda-envs/my-sci-app/20201102
(/home/1001/conda-envs/my-sci-app/20201102)$
With the new virtual environment activated, we can now build mpi4py against the local Open MPI library we added to the shell environment. Because Anaconda tries to use the copy of ld that is part of the virtual environment in lieu of the system ld, you need to remove all permissions from that copy to allow the compile to work properly.
(/home/1001/conda-envs/my-sci-app/20201102)$ chmod 000 /home/1001/conda-envs/my-sci-app/20201102/compiler_compat/ld
(/home/1001/conda-envs/my-sci-app/20201102)$ pip install --no-binary :all: --compile mpi4py
Collecting mpi4py
  Using cached mpi4py-4.0.1.tar.gz (466 kB)
Skipping wheel build for mpi4py, due to binaries being disabled for it.
Installing collected packages: mpi4py
    Running setup.py install for mpi4py ... done
Successfully installed mpi4py-4.0.1
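The chmod above leaves the environment's ld permanently unusable. Whether it should be restored after the build is not stated here; the following sketch assumes restoring is desired and shows one way to do it with a context manager. The name linker_disabled is hypothetical, and the demonstration runs on a throwaway file rather than a real compiler_compat/ld.

```python
import os
import stat
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def linker_disabled(ld_path):
    """Temporarily remove all permissions from ld_path, restoring them on exit."""
    ld_path = Path(ld_path)
    original_mode = stat.S_IMODE(ld_path.stat().st_mode)
    os.chmod(ld_path, 0o000)  # same effect as `chmod 000`
    try:
        yield ld_path
    finally:
        os.chmod(ld_path, original_mode)


# Demonstration on a placeholder file standing in for <env>/compiler_compat/ld.
with tempfile.TemporaryDirectory() as tmp:
    fake_ld = Path(tmp) / "ld"
    fake_ld.write_text("placeholder")
    fake_ld.chmod(0o755)
    with linker_disabled(fake_ld):
        mode_during = stat.S_IMODE(fake_ld.stat().st_mode)
    mode_after = stat.S_IMODE(fake_ld.stat().st_mode)

print(oct(mode_during), oct(mode_after))
```

In real use the pip install step would run inside the with-block, with the path pointing at the environment's compiler_compat/ld.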
The --no-binary :all: flag prohibits the installation of prebuilt binary packages, effectively forcing a rebuild of mpi4py from source. The --compile flag compiles all Python scripts in the mpi4py package to bytecode at install time (versus allowing them to be compiled and cached later). The environment now includes support for mpi4py linked against the openmpi/4.1.4:gcc-12.1.0 library on Caviness:
(/home/1001/conda-envs/my-sci-app/20201102)$ pip list | grep mpi4py
mpi4py 4.0.1
Additional packages that require mpi4py can now be installed into the environment.
The new virtual environment can easily be added to your login shell and job runtime environments using VALET. First, ensure you have your personal VALET package definition directory present:
$ mkdir -p ${HOME}/.valet
$ echo ${HOME}/conda-envs/my-sci-app
/home/1001/conda-envs/my-sci-app
Take note of the path echoed, then create a new file named ${HOME}/.valet/my-sci-app.vpkg_yaml
and add the following text to it:
my-sci-app:
    prefix: /home/1001/conda-envs/my-sci-app
    description: Some scientific app project in Python
    flags:
        - no-standard-paths
    actions:
        - action: source
          script:
              sh: anaconda-activate.sh
          order: failure-first
          success: 0
    versions:
        "20201102":
            description: environment built Nov 2, 2020
            dependencies:
                - openmpi/4.1.4:gcc-12.1.0
                - anaconda/2024.02
The versions of the virtual environment declared in the VALET package are listed using the vpkg_versions
command:
$ vpkg_versions my-sci-app

Available versions in package (* = default version):

[/home/1001/.valet/my-sci-app.vpkg_yaml]
my-sci-app    Some scientific app project in Python
* 20201102    environment built Nov 2, 2020
Activating the virtual environment is accomplished using the vpkg_require
command (in your login shell or inside job scripts):
$ vpkg_require my-sci-app/20201102
Adding dependency `libfabric/1.13.2` to your environment
Adding dependency `binutils/2.35` to your environment
Adding dependency `gcc/12.1.0` to your environment
Adding dependency `openmpi/4.1.4:gcc-12.1.0` to your environment
Adding dependency `anaconda/2024.02` to your environment
Adding package `my-sci-app/20201102` to your environment
(/home/1001/conda-envs/my-sci-app/20201102)$ which python3
~/conda-envs/my-sci-app/20201102/bin/python3
(/home/1001/conda-envs/my-sci-app/20201102)$ pip list | grep mpi4py
mpi4py 4.0.1
$ which mpirun
/opt/shared/openmpi/4.1.4-gcc-12.1.0/bin/mpirun
The steps for completing this work on DARWIN are similar to those presented for Caviness, beginning with the same directory hierarchy. We will instead use the Intel oneAPI Python distribution:
$ vpkg_require openmpi/5.0.2:intel-oneapi-2024 intel-oneapi/2024
Adding dependency `gcc/12.2.0` to your environment
Adding dependency `intel-oneapi/2024.0.1.46` to your environment
Adding dependency `ucx/1.13.1` to your environment
Adding package `openmpi/5.0.2:intel-oneapi-2024` to your environment
The virtual environment is first populated with all packages that do not require mpi4py. Any packages requiring mpi4py must be installed after we build and install our local copy of mpi4py in the virtual environment. In this example, neither NumPy nor SciPy requires mpi4py.
$ conda create --prefix ${HOME}/conda-envs/my-sci-app/20240307 --channel intel --override-channels python'>=3.9' numpy scipy
Collecting package metadata (current_repodata.json): done
Solving environment: done
   :
Proceed ([y]/n)? y
   :
#
# To activate this environment, use
#
#     $ conda activate /home/1006/conda-envs/my-sci-app/20240307
#
# To deactivate an active environment, use
#
#     $ conda deactivate
Before building and installing mpi4py the environment needs to be activated:
$ conda activate /home/1006/conda-envs/my-sci-app/20240307
(/home/1006/conda-envs/my-sci-app/20240307)$
With the new virtual environment activated, we can now build mpi4py against the local Open MPI library we added to the shell environment.
(/home/1006/conda-envs/my-sci-app/20240307)$ pip install --no-binary :all: --compile mpi4py
Collecting mpi4py
  Downloading mpi4py-3.1.5.tar.gz (2.5 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.5/2.5 MB 16.7 MB/s eta 0:00:00
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Building wheels for collected packages: mpi4py
  Building wheel for mpi4py (pyproject.toml) ... done
  Created wheel for mpi4py: filename=mpi4py-3.1.5-cp310-cp310-linux_x86_64.whl size=634821 sha256=78a58c10acd22b3cf2ebf9e73b445d6775ac29f3f59c37e63bd16e27b7467ba2
  Stored in directory: /home/1006/.cache/pip/wheels/18/2b/7f/c852523089e9182b45fca50ff56f49a51eeb6284fd25a66713
Successfully built mpi4py
Installing collected packages: mpi4py
Successfully installed mpi4py-3.1.5
The --no-binary :all: flag prohibits the installation of prebuilt binary packages, effectively forcing a rebuild of mpi4py from source. The --compile flag compiles all Python scripts in the mpi4py package to bytecode at install time (versus allowing them to be compiled and cached later). The environment now includes support for mpi4py linked against the openmpi/5.0.2:intel-oneapi-2024 library on DARWIN:
(/home/1006/conda-envs/my-sci-app/20240307)$ pip list | grep mpi4py
mpi4py 3.1.5
Additional packages that require mpi4py can now be installed into the environment.
The new virtual environment can easily be added to your login shell and job runtime environments using VALET. First, ensure you have your personal VALET package definition directory present:
$ mkdir -p ${HOME}/.valet
$ echo ${HOME}/conda-envs/my-sci-app
/home/1006/conda-envs/my-sci-app
Take note of the path echoed, then create a new file named ${HOME}/.valet/my-sci-app.vpkg_yaml
and add the following text to it:
my-sci-app:
    prefix: /home/1006/conda-envs/my-sci-app
    description: Some scientific app project in Python
    flags:
        - no-standard-paths
    actions:
        - action: source
          script:
              sh: anaconda-activate.sh
          order: failure-first
          success: 0
    versions:
        "20240307":
            description: environment built Mar 7, 2024
            dependencies:
                - openmpi/5.0.2:intel-oneapi-2024
                - intel-oneapi/2024
The versions of the virtual environment declared in the VALET package are listed using the vpkg_versions
command:
$ vpkg_versions my-sci-app

Available versions in package (* = default version):

[/home/1006/.valet/my-sci-app.vpkg_yaml]
my-sci-app    Some scientific app project in Python
* 20240307    environment built Mar 7, 2024
Activating the virtual environment is accomplished using the vpkg_require
command (in your login shell or inside job scripts):
$ vpkg_require my-sci-app/20240307
Adding dependency `gcc/12.2.0` to your environment
Adding dependency `intel-oneapi/2024.0.1.46` to your environment
Adding dependency `ucx/1.13.1` to your environment
Adding dependency `openmpi/5.0.2:intel-oneapi-2024` to your environment
Adding package `my-sci-app/20240307` to your environment
(/home/1006/conda-envs/my-sci-app/20240307)$ which python3
~/conda-envs/my-sci-app/20240307/bin/python3
(/home/1006/conda-envs/my-sci-app/20240307)$ pip list | grep mpi4py
mpi4py 3.1.5
(/home/1006/conda-envs/my-sci-app/20240307)$ which mpirun
/opt/shared/openmpi/5.0.2-intel-oneapi-2024/bin/mpirun
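Once the package is loaded, a short mpi4py program is a reasonable end-to-end test that Python, mpi4py, and the cluster's Open MPI work together. The script below is a minimal sketch, not a benchmark: each rank reports itself and all ranks sum their rank numbers with an allreduce. The import is guarded so the script exits cleanly where mpi4py is unavailable.

```python
def main():
    try:
        from mpi4py import MPI
    except ImportError:
        print("mpi4py is not available; load the VALET package first")
        return None
    comm = MPI.COMM_WORLD
    rank, size = comm.Get_rank(), comm.Get_size()
    # Sum the ranks across all processes; every rank receives the result.
    total = comm.allreduce(rank, op=MPI.SUM)
    print(f"rank {rank} of {size} on {MPI.Get_processor_name()}: sum of ranks = {total}")
    return rank, size, total


result = main()
```

Saved under a name of your choosing (hello_mpi.py, say), it would typically be launched inside a job with something like mpirun -np 4 python3 hello_mpi.py; with four ranks every process should report a sum of 6.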