Table of Contents

Mpi4py for Farber

MPI for Python (mpi4py) provides bindings of the Message Passing Interface (MPI) standard for the Python programming language, allowing any Python program to exploit multiple processors.

$ vpkg_versions python-mpi4py
Available versions in package (* = default version):
 
[/opt/shared/valet/2.0.1/etc/python-mpi4py.vpkg_json]
python-mpi4py        Fundamental library for scientific computing
* 1.3.1-python2.7.8  Version 1.3.1 and python 2.7.8
  1.3.1-python3.2.5  Version 1.3.1 and python 3.2.5
  3.0.3-python3.6.3  Version 3.0.3 and python 3.6.3
  python2.7.8        alias to python-mpi4py/1.3.1-python2.7.8
  python3.2.5        alias to python-mpi4py/1.3.1-python3.2.5
  python3.6.3        alias to python-mpi4py/3.0.3-python3.6.3

Sample mpi4py script

Adapted from the documentation provided by NASA Modeling Guru consider the mpi4py script that implements the scatter-gather procedure based on Python 2:

scatter-gather.py
#--------------
# Loaded Modules
#--------------
import numpy as np
from mpi4py import MPI
 
#-------------
# Communicator
#-------------
comm = MPI.COMM_WORLD
 
my_N = 4
N = my_N * comm.size
 
if comm.rank == 0:
    A = np.arange(N, dtype=np.float64)
else:
    #Note that if I am not the root processor A is an empty array
    A = np.empty(N, dtype=np.float64)
 
my_A = np.empty(my_N, dtype=np.float64)
 
#-------------------------
# Scatter data into my_A arrays
#-------------------------
comm.Scatter( [A, MPI.DOUBLE], [my_A, MPI.DOUBLE] )
 
if comm.rank == 0:
   print "After Scatter:"
 
for r in xrange(comm.size):
    if comm.rank == r:
        print "[%d] %s" % (comm.rank, my_A)
    comm.Barrier()
 
#-------------------------
# Everybody is multiplying by 2
#-------------------------
my_A *= 2
 
#-----------------------
# Allgather data into A again
#-----------------------
comm.Allgather( [my_A, MPI.DOUBLE], [A, MPI.DOUBLE] )
 
if comm.rank == 0:
   print "After Allgather:"
 
for r in xrange(comm.size):
    if comm.rank == r:
        print "[%d] %s" % (comm.rank, A)
    comm.Barrier()

Batch job

Any MPI job requires you to use mpirun to initiate it, and this should be done through the Grid Engine job scheduler to best utilize the resources on the cluster. Also, if you want to run on more than 1 node (more than 20+ cores), then you must initiate a batch job from the head node. Remember if you only have 1 node in your workgroup, then you would need to take advantage of the standby queue to be able to run a job utilizing multiple nodes.

The best results on Farber have been found by using the openmpi-ib.qs template for Open MPI jobs. For example, copy the template and call it mympi4py.qs for the job script using

cp /opt/shared/templates/gridengine/openmpi/openmpi-ib.qs mympi4py.qs

and modify it for your application. Make sure you read the comments in the job script to select the appropriate option specifically modify the NPROC to specify the number of cores for #$ -pe mpi NPROC and understand you get 1GB of memory per NPROC (cores). Also if you specify exclusive access by using -l exclusive=1, then no other jobs can be running on the nodes, giving exclusive access to your job. Make sure you specify the correct VALET environment for your job selecting the correct version for Python 2 or 3 for mpi4py. Since the above example is based on Python 2 and needs mpi4py, we will specify the VALET package as follows:

vpkg_require python-mpi4py/python2.7.8

Lastly, modify MY_EXE to run Python (either python or python3) and MY_EXE_ARGS should be defined as the Python script to run. In this example, it would be as follows for Python 2:

MY_EXE="python"
MY_EXE_ARGS=("scatter-gather.py")

All the options for mpirun will automatically be determined based on the options you selected for your job script. Now to run this job, from the head node, Farber, first make sure you are in your workgroup

workgroup -g //investing-entity//

then simple submit your job using

qsub mympi4py.qs

or

qsub -l exclusive=1 mympi4py.qs

for exclusive access of the nodes needed for your job.

Remember if you want to specify more cores for NPROC than available in your workgroup, then you need to specify the standby queue -l standby=1 but remember it has limited run times based on the total number of cores you are using for all of your jobs. In this example, since it is only requesting 48 cores, then it can run up to 8 hours using

qsub -l exclusive=1 -l standby=1 mympi4py.qs

Output

The following output is based on the Python 2 script scatter-gather.py submitted with 4 cores $# -pe mpi 4 and 1GB of memory per core in the mympi4py.qs as described above:

[CGROUPS] UD Grid Engine cgroup setup commencing
[CGROUPS] WARNING: No OS-level core-binding can be made for mpi jobs
[CGROUPS] Setting 1073741824 bytes (vmem none bytes) on n039 (master)
[CGROUPS]   with 4 cores
[CGROUPS] done.
 
Adding dependency `python/2.7.8` to your environment
Adding dependency `openmpi/1.8.2` to your environment
Adding package `python-mpi4py/1.3.1-python2.7.8` to your environment
Adding dependency `atlas/3.10.2` to your environment
Adding package `python-numpy/1.8.2-python2.7.8` to your environment
GridEngine parameters:
  mpirun        = /opt/shared/openmpi/1.8.2/bin/mpirun
  nhosts        = 1
  nproc         = 4
  executable    = python
  Open MPI vers = 1.8.2
  MPI flags     = --display-map --mca btl ^tcp
-- begin OPENMPI run --
 Data for JOB [64887,1] offset 0
 
 ========================   JOB MAP   ========================
 
 Data for node: n039    Num slots: 4    Max slots: 0    Num procs: 4
        Process OMPI jobid: [64887,1] App: 0 Process rank: 0
        Process OMPI jobid: [64887,1] App: 0 Process rank: 1
        Process OMPI jobid: [64887,1] App: 0 Process rank: 2
        Process OMPI jobid: [64887,1] App: 0 Process rank: 3
 
 =============================================================
[2] [  8.   9.  10.  11.]
After Scatter:
[0] [ 0.  1.  2.  3.]
[1] [ 4.  5.  6.  7.]
[3] [ 12.  13.  14.  15.]
[3] [  0.   2.   4.   6.   8.  10.  12.  14.  16.  18.  20.  22.  24.  26.  28.
  30.]
[1] [  0.   2.   4.   6.   8.  10.  12.  14.  16.  18.  20.  22.  24.  26.  28.
  30.]
After Allgather:
[0] [  0.   2.   4.   6.   8.  10.  12.  14.  16.  18.  20.  22.  24.  26.  28.
  30.]
[2] [  0.   2.   4.   6.   8.  10.  12.  14.  16.  18.  20.  22.  24.  26.  28.
  30.]
-- end OPENMPI run --

Recipes

If you need to build a Python virtualenv based on a collection of Python modules including mpi4py, then you will need to follow this recipe to get a properly-integrated mpi4py module.