Open MPI on Mills

IT provides three job script templates for the openmpi parallel environment on Mills, located in /opt/shared/templates/gridengine/openmpi.

You may copy and customize these templates to get the best performance from your Open MPI applications. See Running Applications on Mills for details about resources. The options in each template are easiest to understand after reading about Mills tuning and threading performance, which explains how resource choices affect your applications.

It is a good idea to check /opt/shared/templates/gridengine/openmpi periodically for changes to existing templates, or for new templates, designed to provide the best performance on Mills.
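One way to notice that a template has changed since you copied it is to keep an unmodified copy and compare it against the current version. The following is a minimal sketch, not something IT provides: the function name template_changed and the location of the saved copy are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: report whether a saved, unmodified copy of a template
# still matches the version currently in the templates directory.
# Usage: template_changed <current-template> <saved-copy>
# Prints "unchanged" if the files are byte-for-byte identical, otherwise
# "changed" (this includes the case where either file cannot be read).
template_changed() {
    local current="$1" saved="$2"
    if cmp -s "$current" "$saved"; then
        echo "unchanged"
    else
        echo "changed"
    fi
}
```

For example, template_changed /opt/shared/templates/gridengine/openmpi/openmpi-psm.qs ~/templates/openmpi-psm.qs.orig would compare the current PSM template with a copy saved earlier; the second path is only an illustration.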

PSM (Performance-Scaled Messaging)

Performance-Scaled Messaging (PSM) is an accelerated interface between MPI libraries and the InfiniBand network adapter. The PSM software uses hardware contexts to provide the direct interface between an MPI library and the InfiniBand hardware, and only a limited number of contexts are available on a node: 16, to be exact. So on a 24-core node there are not enough contexts to give each core its own. The default behavior of PSM-aware software is to grab as many of the contexts as possible, so if you end up sharing a node with other MPI programs that have already claimed them, there are likely zero PSM contexts available. The library's way of telling you this is the following error message:

	ipath_userinit: assign_context command failed: Network is down
	can't open /dev/ipath, network down (err=26)

This error is not terribly intuitive.
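Because the message does not mention PSM contexts, it can help to search a job's output file for it explicitly. The following is a minimal sketch; the function name psm_context_error is hypothetical, and the patterns are taken from the error text shown above.

```shell
#!/bin/bash
# Hypothetical helper: exit with status 0 if the given job output file
# contains either line of the PSM context error quoted above, and with a
# non-zero status otherwise.
# Usage: psm_context_error <job-output-file>
psm_context_error() {
    grep -q -e 'assign_context command failed' \
            -e "can't open /dev/ipath" "$1"
}
```

A call such as psm_context_error myjob.o12345 && echo "PSM contexts were exhausted" would flag an affected run; the output file name here is only an illustration.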

The PSM-aware job script template, openmpi-psm.qs, that IT provides includes a section of Bash code that determines how many PSM contexts are available on each node where your job is scheduled and sets environment variables to limit PSM usage accordingly.
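The idea of limiting PSM usage to what a node can offer can be sketched as follows. This is an illustration only, not the code in openmpi-psm.qs: the function name psm_context_limit is hypothetical, the number of free contexts is passed in rather than measured, and the use of PSM_SHAREDCONTEXTS_MAX as the limiting variable is an assumption. Consult the template itself for the variables it actually sets.

```shell
#!/bin/bash
# Sketch: choose how many PSM hardware contexts a job should claim on a node.
#   free_contexts: contexts currently unused on the node (how to measure this
#                  is not shown here; it is treated as an input)
#   ranks_on_node: MPI processes this job will run on the node
# Prints the smaller of the two values, so the job never asks for more
# contexts than are free or more than it has processes to use them.
psm_context_limit() {
    local free_contexts="$1" ranks_on_node="$2"
    if [ "$free_contexts" -lt "$ranks_on_node" ]; then
        echo "$free_contexts"
    else
        echo "$ranks_on_node"
    fi
}

# Example: all 16 contexts are free and the job runs 24 ranks on a 24-core
# node, so the limit is 16 and some ranks must share a context.
limit=$(psm_context_limit 16 24)

# ASSUMPTION: PSM_SHAREDCONTEXTS_MAX is the variable that caps context usage;
# the template may set different or additional variables.
export PSM_SHAREDCONTEXTS_MAX="$limit"
```

Taking the minimum is just one reasonable policy; a script could instead reserve some contexts for other jobs that share the node.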