Differences

This shows you the differences between two versions of the page.

--- software:mpi4py:farber [2020-04-22 17:20] – [Batch job] anita
+++ software:mpi4py:farber [2021-04-27 16:21] (current) – external edit 127.0.0.1
@@ Line 17: / Line 17: @@
 ===== Sample mpi4py script =====
-Adapted from the documentation provided by [[https://modelingguru.nasa.gov/docs/DOC-2412|NASA Modeling Guru]] consider the mpi4py script that implements the [[https://github.com/jbornschein/mpi4py-examples/blob/master/03-scatter-gather|scatter-gather procedure]]:
+Adapted from the documentation provided by [[https://modelingguru.nasa.gov/docs/DOC-2412|NASA Modeling Guru]], consider the following mpi4py script, written for Python 2, that implements the [[https://github.com/jbornschein/mpi4py-examples/blob/master/03-scatter-gather|scatter-gather procedure]]:
 <code - scatter-gather.py>
@@ Line 76: / Line 76: @@
 ===== Batch job =====
-Any MPI job requires you to use ''mpirun'' to initiate it, and this should be done through the Grid Engine job scheduler to best utilize the resources on the cluster.  Also, if you want to run on more than 1 node (more than 24 or 48 cores depending on the node specifications), then you must initiate a batch job from the head node. Remember if you only have 1 node in your workgroup, then you would need to take advantage of the [[abstract:farber:runjobs:queues#farber-standby-queues|standby]] queue to be able to run a job utilizing multiple nodes.
+Any MPI job requires you to use ''mpirun'' to initiate it, and this should be done through the Grid Engine job scheduler to best utilize the resources on the cluster.  Also, if you want to run on more than one node (more than the 20 or more cores available on a single node), then you must initiate a batch job from the head node. Remember that if you only have one node in your workgroup, you need to use the [[abstract:farber:runjobs:queues#farber-standby-queues|standby]] queue to run a job that spans multiple nodes.
 The best results on Farber have been found by using the //openmpi-ib.qs// template for [[software:openmpi:farber|Open MPI]] jobs. For example, copy the template and call it ''mympi4py.qs'' for the job script using
 <code bash>
-cp /opt/templates/gridengine/openmpi/openmpi-ib.qs mympi4py.qs
+cp /opt/shared/templates/gridengine/openmpi/openmpi-ib.qs mympi4py.qs
 </code>
-and modify it for your application. Make sure you read the comments in the job script to select the appropriate options. However if you specify [[abstract:farber:runjobs:queues#farber-exclusive-access|exclusive access]] by using ''-l exclusive=1'', then no other jobs can be running on the nodes, giving exclusive access to your job. Make sure you specify the correct VALET environment for your job selecting the correct version for python2 or python3 and openmpi.
+and modify it for your application. Make sure you read the comments in the job script to select the appropriate options. In particular, change ''NPROC'' in ''#$ -pe mpi NPROC'' to the number of cores you need, and note that you get 1GB of memory per core. If you also specify [[abstract:farber:runjobs:queues#farber-exclusive-access|exclusive access]] by using ''-l exclusive=1'', then no other jobs can run on the nodes assigned to your job. Make sure you specify the correct VALET environment for your job, selecting the version of mpi4py that matches Python 2 or Python 3. Since the above example is based on Python 2 and needs mpi4py, we specify the VALET package as follows:
 <code bash>
-vpkg_require python-mpi4py/python3.6.3
+vpkg_require python-mpi4py/python2.7.8
-vpkg_require python-numpy/python3.6.3
 </code>
-Lastly, modify ''MY_EXE'' for your mpi4py script.  In this example, it would be
+Lastly, set ''MY_EXE'' to the Python interpreter (either ''python'' or ''python3'') and set ''MY_EXE_ARGS'' to the Python script to run.  For this Python 2 example, that would be:
 <code>
-MY_EXE="python3 scatter-gather.py"
+MY_EXE="python"
+MY_EXE_ARGS=("scatter-gather.py")
 </code>
-All the options for ''mpirun'' will automatically be determined based on the options you selected above for your job script. Now to run this job, from the head node, Mills, simply use
+All the options for ''mpirun'' will automatically be determined based on the options you selected for your job script. To run this job from the Farber head node, first make sure you are in your workgroup
+<code bash>
+workgroup -g //investing-entity//
+</code>
+then simply submit your job using
 <code bash>
 qsub mympi4py.qs
@@ Line 116: / Line 122: @@
 qsub -l exclusive=1 -l standby=1 mympi4py.qs
 </code>
+==== Output ====
+The following output is from the Python 2 script ''scatter-gather.py'' submitted with 4 cores (''#$ -pe mpi 4'') and 1GB of memory per core in ''mympi4py.qs'', as described above:
+<code bash>
+[CGROUPS] UD Grid Engine cgroup setup commencing
+[CGROUPS] WARNING: No OS-level core-binding can be made for mpi jobs
+[CGROUPS] Setting 1073741824 bytes (vmem none bytes) on n039 (master)
+[CGROUPS]   with 4 cores
+[CGROUPS] done.
+Adding dependency `python/2.7.8` to your environment
+Adding dependency `openmpi/1.8.2` to your environment
+Adding package `python-mpi4py/1.3.1-python2.7.8` to your environment
+Adding dependency `atlas/3.10.2` to your environment
+Adding package `python-numpy/1.8.2-python2.7.8` to your environment
+GridEngine parameters:
+  mpirun        = /opt/shared/openmpi/1.8.2/bin/mpirun
+  nhosts        = 1
+  nproc         = 4
+  executable    = python
+  Open MPI vers = 1.8.2
+  MPI flags     = --display-map --mca btl ^tcp
+-- begin OPENMPI run --
+ Data for JOB [64887,1] offset 0
+ ========================   JOB MAP   ========================
+ Data for node: n039    Num slots: 4    Max slots: 0    Num procs: 4
+        Process OMPI jobid: [64887,1] App: 0 Process rank: 0
+        Process OMPI jobid: [64887,1] App: 0 Process rank: 1
+        Process OMPI jobid: [64887,1] App: 0 Process rank: 2
+        Process OMPI jobid: [64887,1] App: 0 Process rank: 3
+ =============================================================
+[2] [  8.   9.  10.  11.]
+After Scatter:
+[0] [ 0.  1.  2.  3.]
+[1] [ 4.  5.  6.  7.]
+[3] [ 12.  13.  14.  15.]
+[3] [  0.   2.   4.   6.   8.  10.  12.  14.  16.  18.  20.  22.  24.  26.  28.
+.]
+[1] [  0.   2.   4.   6.   8.  10.  12.  14.  16.  18.  20.  22.  24.  26.  28.
+.]
+After Allgather:
+[0] [  0.   2.   4.   6.   8.  10.  12.  14.  16.  18.  20.  22.  24.  26.  28.
+.]
+[2] [  0.   2.   4.   6.   8.  10.  12.  14.  16.  18.  20.  22.  24.  26.  28.
+.]
+-- end OPENMPI run --
+</code>
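To help read the output above, here is a plain-Python sketch of what the scatter and allgather steps do to the data. It is not the MPI script and does not use mpi4py: the helper names ''scatter'' and ''allgather'' are hypothetical, and the sketch assumes 4 ranks, 16 starting values, and (as the printed values suggest) that each rank doubles its chunk between the two steps.

```python
# Plain-Python illustration only: no MPI is involved, and the helper
# functions below are hypothetical stand-ins for the collective operations.

def scatter(data, nranks):
    """Split data into nranks equal contiguous chunks, one per rank."""
    if len(data) % nranks != 0:
        raise ValueError("data length must be divisible by the number of ranks")
    chunk = len(data) // nranks
    return [data[r * chunk:(r + 1) * chunk] for r in range(nranks)]


def allgather(chunks):
    """Concatenate every rank's chunk and give each rank its own full copy."""
    full = [x for chunk in chunks for x in chunk]
    return [list(full) for _ in chunks]


nranks = 4                                   # assumed: matches "-pe mpi 4"
data = [float(i) for i in range(16)]         # assumed: 4 values per rank

local = scatter(data, nranks)                # rank r holds local[r]
doubled = [[2 * x for x in chunk] for chunk in local]
gathered = allgather(doubled)                # every rank holds the full result

print("After Scatter:")
for rank, chunk in enumerate(local):
    print("[%d] %s" % (rank, chunk))

print("After Allgather:")
for rank, full in enumerate(gathered):
    print("[%d] %s" % (rank, full))
```

Unlike this sketch, which prints ranks in order, the processes in a real MPI run print independently, so the rank lines in the job output can appear in any order.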
+===== Recipes =====
+If you need to build a Python virtualenv based on a collection of Python modules including mpi4py, then you will need to follow this recipe to get a properly-integrated mpi4py module.
+  * [[technical:recipes:mpi4py-in-virtualenv|Building a Python virtualenv with a properly-integrated mpi4py module]]