====== Mpi4py for Caviness ======
MPI for Python ([[http://mpi4py.scipy.org/docs/|mpi4py]]) provides bindings of the Message Passing Interface (MPI) standard for the Python programming language, allowing any Python program to exploit multiple processors.
On Caviness, use VALET to list the available ''python-mpi'' versions:

<code bash>
$ vpkg_versions python-mpi

Available versions in package (* = default version):

[/opt/shared/valet/2.1/etc/python-mpi.vpkg_yaml]
python-mpi           Python MPI and dependent Modules
* 2.7.15:20180613    Python MPI modules with Open MPI + system compiler
  3.6.5:20180613     Python MPI modules with Open MPI + system compiler
</code>
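As a quick check that the environment is working, a minimal mpi4py script along the following lines (the filename ''hello-mpi.py'' is only an illustration, not part of this documentation) prints one line per MPI rank:

<code python>
# hello-mpi.py -- minimal mpi4py sanity check (illustrative sketch only)
from mpi4py import MPI

comm = MPI.COMM_WORLD      # communicator containing every rank started by mpirun
print("Hello from rank %d of %d" % (comm.rank, comm.size))
</code>

When started with ''mpirun'' on 4 cores (for example, via a batch job as described below), this prints four lines, one per rank.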
===== Sample mpi4py script =====
Adapted from the documentation provided by [[https://modelingguru.nasa.gov/docs/DOC-2412|NASA Modeling Guru]], consider the following mpi4py script, ''scatter-gather.py'', which implements the [[https://github.com/jbornschein/mpi4py-examples/blob/master/03-scatter-gather|scatter-gather procedure]]:
<code python>
#---------------
# Loaded Modules
#---------------
import numpy as np
from mpi4py import MPI

#-------------
# Communicator
#-------------
comm = MPI.COMM_WORLD

my_N = 4
N = my_N * comm.size

if comm.rank == 0:
    A = np.arange(N, dtype=np.float64)
else:
    # Note that if I am not the root processor A is an empty array
    A = np.empty(N, dtype=np.float64)

my_A = np.empty(my_N, dtype=np.float64)

#------------------------------
# Scatter data into my_A arrays
#------------------------------
comm.Scatter( [A, MPI.DOUBLE], [my_A, MPI.DOUBLE] )

if comm.rank == 0:
    print "After Scatter:"

for r in xrange(comm.size):
    if comm.rank == r:
        print "[%d] %s" % (comm.rank, my_A)
    comm.Barrier()

#------------------------------
# Everybody is multiplying by 2
#------------------------------
my_A *= 2

#-----------------------------
# Allgather data into A again
#-----------------------------
comm.Allgather( [my_A, MPI.DOUBLE], [A, MPI.DOUBLE] )

if comm.rank == 0:
    print "After Allgather:"

for r in xrange(comm.size):
    if comm.rank == r:
        print "[%d] %s" % (comm.rank, A)
    comm.Barrier()
</code>
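If you prefer the Python 3 package (''python-mpi/3.6.5:20180613''), the script only needs the usual Python 2 to Python 3 adjustments, namely ''print()'' as a function and ''range()'' in place of ''xrange()''; the MPI calls themselves are unchanged. As a sketch, the first print loop would become:

<code python>
# Python 3 form of the post-Scatter print loop; everything else is unchanged.
if comm.rank == 0:
    print("After Scatter:")

for r in range(comm.size):
    if comm.rank == r:
        print("[%d] %s" % (comm.rank, my_A))
    comm.Barrier()
</code>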
===== Batch job =====
Any MPI job requires ''mpirun'' to start it, and this should be done through the Slurm job scheduler to best utilize the resources on the cluster. If you want to run on more than one node (more than 36 cores), then you must initiate a batch job from the head node. Remember that if you only have one node in your workgroup, then you would need to take advantage of the [[abstract:caviness:runjobs:queues#the-standard-partition|standard]] partition to run a job utilizing multiple nodes; however, keep in mind that jobs in the standard partition can be preempted, so you will need to be mindful of [[abstract:caviness:runjobs:schedule_jobs#handling-system-signals-aka-checkpointing|checkpointing]] your job.
The best results have been found by using the //openmpi.qs// template for [[/software/openmpi/openmpi|Open MPI]] jobs. For example, copy the template and call it ''mympi4py.qs'' for the job script using
<code bash>
cp /opt/shared/templates/slurm/generic/mpi/openmpi/openmpi.qs mympi4py.qs
</code>
and modify it for your application. There are several ways to communicate the number and layout of worker processes; in this example, we will modify the job script to request a single node and 4 cores using ''#SBATCH --nodes=1'' and ''#SBATCH --ntasks=4''. It is important to carefully read the comments and select the appropriate options for your job. Make sure you specify the correct VALET environment for your job, selecting the ''python-mpi'' version that matches Python 2 or Python 3. Since the above example is based on Python 2, we will specify the VALET package as follows:
<code bash>
vpkg_require python-mpi/2.7.15:20180613
</code>
Lastly, modify the section that executes your MPI program. In this example, it would be
<code bash>
${UD_MPIRUN} python scatter-gather.py
</code>
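Putting these modifications together, the edited portions of ''mympi4py.qs'' would look roughly like the following sketch. The ''--mem-per-cpu'' value reflects the 1GB of memory per core used in the run below, and the rest of the template's setup logic (which defines ''UD_MPIRUN'') is assumed to be left as distributed:

<code bash>
# Sketch of the sections of mympi4py.qs edited for this example; all other
# lines of the openmpi.qs template are assumed to remain unchanged.

#SBATCH --nodes=1
#SBATCH --ntasks=4
#SBATCH --mem-per-cpu=1G

# VALET environment for the Python 2 MPI modules:
vpkg_require python-mpi/2.7.15:20180613

# ... template setup code (defines UD_MPIRUN, Open MPI flags, etc.) ...

# Execute the MPI program:
${UD_MPIRUN} python scatter-gather.py
</code>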
All the options for ''mpirun'' will automatically be determined based on the options you selected above in your job script. Now to run this job, from the Caviness head (login) node, simply use
<code bash>
sbatch mympi4py.qs
</code>
The following output is based on the Python 2 script ''scatter-gather.py'' submitted with 4 cores and 1GB of memory per core using the ''mympi4py.qs'' job script described above:
<code>
Adding dependency `python/2.7.15` to your environment
Adding dependency `libfabric/1.6.1` to your environment
Adding dependency `openmpi/3.1.0` to your environment
Adding package `python-mpi/2.7.15:20180613` to your environment
-- Open MPI job setup complete (on r03n33):
--  mpi job startup    = /opt/shared/openmpi/3.1.0/bin/mpirun
--  nhosts             = 1
--  nproc              = 4
--  nproc-per-node     = 4
--  cpus-per-proc      = 1
-- Open MPI environment flags:
--   OMPI_MCA_btl_base_exclude=tcp
--   OMPI_MCA_rmaps_base_display_map=true
--   OMPI_MCA_orte_hetero_nodes=true
--   OMPI_MCA_hwloc_base_binding_policy=core
--   OMPI_MCA_rmaps_base_mapping_policy=core
 Data for JOB [51033,1] offset 0 Total slots allocated 4

 ========================   JOB MAP   ========================

 Data for node: r03n33  Num slots: 4  Max slots: 0  Num procs: 4
        Process OMPI jobid: [51033,1] App: 0 Process rank: 0 Bound: socket 0[core 0[hwt 0]]:[B/././.]
        Process OMPI jobid: [51033,1] App: 0 Process rank: 1 Bound: socket 0[core 1[hwt 0]]:[./B/./.]
        Process OMPI jobid: [51033,1] App: 0 Process rank: 2 Bound: socket 0[core 2[hwt 0]]:[././B/.]
        Process OMPI jobid: [51033,1] App: 0 Process rank: 3 Bound: socket 0[core 3[hwt 0]]:[./././B]

 =============================================================
After Scatter:
[0] [0. 1. 2. 3.]
[1] [4. 5. 6. 7.]
[2] [ 8.  9. 10. 11.]
[3] [12. 13. 14. 15.]
After Allgather:
[0] [ 0.  2.  4.  6.  8. 10. 12. 14. 16. 18. 20. 22. 24. 26. 28. 30.]
[1] [ 0.  2.  4.  6.  8. 10. 12. 14. 16. 18. 20. 22. 24. 26. 28. 30.]
[2] [ 0.  2.  4.  6.  8. 10. 12. 14. 16. 18. 20. 22. 24. 26. 28. 30.]
[3] [ 0.  2.  4.  6.  8. 10. 12. 14. 16. 18. 20. 22. 24. 26. 28. 30.]
</code>
===== Recipes =====
If you need to build a Python virtualenv based on a collection of Python modules including mpi4py, then you will need to follow this recipe to get a properly-integrated mpi4py module.
* [[technical:recipes:mpi4py-in-virtualenv|Building a Python virtualenv with a properly-integrated mpi4py module]]