====== Programming Environment ======
//This section uses the wiki's [[:#documentation-conventions|documentation conventions]].//
===== Programming models =====
There are two memory models for parallel computing: distributed-memory and shared-memory. In the former, programs use the Message Passing Interface (%%MPI%%) to communicate between processes, each of which has its own memory address space. In the latter, Open Multi-Processing (OpenMP, abbreviated OMP below) techniques let multiple threads (lightweight processes) access memory in a common address space. When your job spans several compute nodes, you must use an MPI model.
Distributed-memory systems use the single-program multiple-data (SPMD) and multiple-program multiple-data (MPMD) programming paradigms. In the SPMD paradigm, each processor loads the same program image and executes it on data in its own address space (different data). This is the usual mechanism for MPI code: a single executable is available on each node (through a globally accessible file system such as $WORKDIR or ''/lustre/scratch'') and is launched on each node (through the MPI launcher command, **mpirun**).
The shared-memory programming model is used on Symmetric Multi-Processor (SMP) nodes such as a single typical compute node (20 or 24 cores, 64 GB memory). The programming paradigm for this memory model is called Parallel Vector Processing (PVP) or Shared-Memory Parallel Programming (SMPP). The former name is derived from the fact that vectorizable loops are often employed as the primary structure for parallelization. The main point of SMPP computing is that all of the processors in the same node share data in a single memory subsystem. There is no need for explicit messaging between processors as with MPI coding.
The SMPP paradigm employs compiler directives (as pragmas in C/C++ and special comments in Fortran) or explicit threading calls (e.g. with Pthreads). The majority of science codes now use OpenMP directives that are understood by most vendor compilers, as well as the GNU compilers.
In cluster systems that have SMP nodes and a high-speed interconnect between them, programmers often treat all CPUs within the cluster as having their own local memory. On a node, an MPI executable is launched on each processor core and runs within a separate address space. In this way, all processor cores appear as a set of distributed memory machines, even though each node has processor cores that share a single memory subsystem.
Clusters with SMPs sometimes employ hybrid programming to take advantage of higher performance at the node-level for certain algorithms that use SMPP (OMP) parallel coding techniques. In hybrid programming, OMP code is executed on the node as a single process with multiple threads (or an OMP library routine is called), while MPI programming is used at the cluster-level for exchanging data between the distributed memories of the nodes.
===== Compiling code =====
Fortran, C, C++, Java and MATLAB programs should be compiled on the login node. However, if a compile is lengthy or you want to schedule a job for compilation, you must use the ''devel'' partition with ''salloc'' or ''sbatch'' to make sure you are allocated a compute node with the development tools, libraries, etc. that the compilers need. **//All resulting executables should only be run on the compute nodes.//**
===== The compiler suites =====
There are three 64-bit compiler suites that IT generally installs and supports: PGI CDK (Portland Group Inc.'s Cluster Development Kit), Intel Composer XE 2011, and GNU. In addition, IT has installed OpenJDK (Open Java Development Kit), which must only be used on the compute nodes. (Type **vpkg_info openjdk** for more information on OpenJDK.)
The PGI compilers exploit special features of AMD processors. If you use open-source compilers, we recommend the GNU collection.
You can use a [[/software/valet/valet|VALET]] **vpkg_require** command to set the UNIX environment for the compiler suite you want to use. After you issue the corresponding **vpkg_require** command, the compiler path and supporting environment variables will be defined.
A general command for basic source code compilation is:
<compiler> <compiler_flags> <source_code_filename> -o <executable_filename>
For each compiler suite, the table below displays the compiler name, a link to documentation describing the compiler flags, and the appropriate filename extension for the source code file. The executable will be named **a.out** unless you use the **-o <executable_filename>** option.
To view the compiler option flags, their syntax, and a terse explanation, execute a compiler command with the **-help** option (**%%--help%%** for the GNU compilers). Alternatively, read the compiler's **man** pages.
^ PGI ^ VALET command ^ Reference manuals ^ User guides ^
^ ::: | **vpkg_require pgi** | [[http://www.pgroup.com/doc/pgiref.pdf|C, Fortran]] | [[http://www.pgroup.com/doc/pgiug.pdf|C, Fortran]] |
| ^ Compiler ^ Language ^ Common filename extensions ^
| ::: | pgfortran | F90, F95, F2003 | .f, .for, .f90, .f95 |
| ::: | pgf77 | F77 | .f |
| ::: | pgCC | C++ | .C, .cc |
| ::: | pgcc | C | .c |
^ Intel ^ VALET command ^ Reference manuals ^ User guides ^
^ ::: | **vpkg_require intel** | [[http://software.intel.com/sites/products/documentation/hpc/compilerpro/en-us/cpp/lin/main_cls_lin.pdf|C]], [[http://software.intel.com/sites/products/documentation/hpc/compilerpro/en-us/fortran/lin/main_for_lin.pdf|Fortran]] | [[http://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=2&ved=0CDAQFjAB&url=http%3A%2F%2Fsoftware.intel.com%2Ffile%2F6320&ei=lPfUTuutOKTX0QHZsJmKAg&usg=AFQjCNEk20j03jlsNspzRyYvAEXOeT7aTA|C]], [[http://software.intel.com/sites/products/documentation/hpc/compilerpro/en-us/fortran/lin/main_for_lin.pdf|Fortran]] |
| ^ Compiler ^ Language ^ Common filename extensions ^
| ::: | ifort | F77, F90, F95 | .f, .for, .f90, .f95 |
| ::: | icpc | C++ | .C, .c, .cc, .cpp, .cxx, .c++, .i, .ii |
| ::: | icc | C | .c |
^ GCC ^ VALET command ^ Reference manuals ^ User guides ^
^ ::: |**vpkg_require gcc** | [[http://gcc.gnu.org/onlinedocs/|C]], [[http://gcc.gnu.org/onlinedocs/gcc-4.0.4/gfortran/|Fortran]] | [[http://gcc.gnu.org/onlinedocs|C]], [[http://gcc.gnu.org/onlinedocs/gcc-4.0.4/gfortran/|Fortran]] |
| ^ Compiler ^ Language ^ Common filename extensions ^
| ::: | gfortran, f95 | F77, F90, F95 | .f, .f90, .f95 |
| ::: | g++ | C++ | .C, .c, .cc, .cpp, .cxx, .c++, .i, .ii |
| ::: | gcc | C | .c |
==== Compiling serial programs ====
This section uses the PGI compiler suite to illustrate simple Fortran and C compiler commands that create an executable. For each compiler suite, you must first set the UNIX environment so the compilers and libraries are available to you. [[abstract/Caviness/app_dev/compute_env#using-valet-and-your-unix-environment|VALET]] commands provide a simple way to do this.
The examples below show the compile and link steps in a single command. These illustrations use source code files named ''fdriver.f90'' (Fortran 90) or ''cdriver.c'' (C). They all use the **-o** option to produce an executable named ''driver''. The optional **-fpic** PGI compiler flag generates position-independent code and creates smaller executables. You might also use code optimization option flags such as **-fast** after debugging your program.
You can use the **-c** option instead to create a **.o** object file that you would later link to other object files to create the executable.
Some people use the UNIX **make** command to compile source code. There are many good online tutorials on the [[http://www.eng.hawaii.edu/Tutor/Make|basics of using make]]. Also available is a cross-platform makefile generator, **cmake**. You can set the UNIX environment for **cmake** by typing the **vpkg_require cmake** command.
== Using the PGI suite to illustrate: ==
First use a VALET command to set the environment:
vpkg_require pgi
Then use that compiler suite's commands:
== Fortran 90 example: ==
pgfortran -fpic fdriver.f90 -o driver
== C example: ==
pgcc -fpic cdriver.c -o driver
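The **-c** workflow mentioned above can be sketched as follows. This is a minimal illustration, not a site-specific recipe: the source files ''main.c'' and ''answer.c'' are hypothetical stand-ins generated on the fly, and the compiler is taken from a ''CC'' variable that defaults to the generic ''cc'' (on the cluster you might set ''CC=pgcc'' after **vpkg_require pgi**). The sketch only builds the executable; it does not run it.

```shell
# Work in a scratch directory so nothing in the current directory is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Two tiny, hypothetical source files standing in for a real project.
cat > main.c <<'EOF'
#include <stdio.h>
int answer(void);
int main(void) { printf("%d\n", answer()); return 0; }
EOF
cat > answer.c <<'EOF'
int answer(void) { return 42; }
EOF

# Assumption: CC names your C compiler; it defaults to the generic "cc".
CC=${CC:-cc}

if command -v "$CC" >/dev/null 2>&1; then
    "$CC" -c main.c                   # compile only: writes main.o
    "$CC" -c answer.c                 # compile only: writes answer.o
    "$CC" main.o answer.o -o driver   # link the object files into "driver"
fi

# Report whether the link step produced an executable.
if [ -x driver ]; then build_status=built; else build_status=skipped; fi
echo "driver: $build_status"
```

Keeping the compile and link steps separate means that only the source files you change need to be recompiled before relinking, which is the behavior **make** automates.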
==== Compiling parallel programs that use OpenMP ====
If your program only uses OpenMP directives, has __no__ message passing, and your target is a single SMP node, you should add the [[https://www.openmp.org/resources/openmp-compilers-tools/|OpenMP]] compiler flag to the serial compiler flags.
^ Compiler suite ^ OpenMP compiler flag ^
| **PGI** | -mp |
| **Open64** | -mp |
| **Intel** | -openmp |
| **Intel-2016** | -qopenmp |
| **GCC** | -fopenmp |
\\ Instead of using OpenMP directives in your program, you can add an OpenMP-based library. You will still need the OpenMP compiler flag when you use the library.
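One way to confirm that the OpenMP flag took effect is to check for the ''_OPENMP'' preprocessor macro, which the OpenMP specification requires a compiler to define when OpenMP is enabled. The sketch below does this for the GNU compiler only, using its ''-dM -E'' options to list predefined macros. It assumes ''gcc'' is on your path (for example after **vpkg_require gcc**); the variable names are illustrative.

```shell
if command -v gcc >/dev/null 2>&1; then
    # List predefined macros with and without -fopenmp and count the _OPENMP lines.
    with_flag=$(gcc -fopenmp -dM -E -x c /dev/null 2>/dev/null | grep -c '_OPENMP' || true)
    without_flag=$(gcc -dM -E -x c /dev/null 2>/dev/null | grep -c '_OPENMP' || true)
    echo "_OPENMP definitions with -fopenmp: $with_flag"
    echo "_OPENMP definitions without it:    $without_flag"
else
    # Assumption: no GNU compiler is loaded in this shell.
    without_flag=0
    echo "gcc not found; load a GNU compiler first"
fi
```

A nonzero count with the flag and zero without it indicates that directives in your source will be honored rather than silently ignored.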
==== Compiling parallel programs that use MPI ====
=== MPI implementations ===
In the distributed-memory model, the [[file://localhost/tutorials/mpi|message passing interface]] (%%MPI%%) allows programs to communicate between processors that use their own node's memory address space. It is the most commonly used library and runtime environment for building and executing distributed-memory applications on clusters of computers.
**OpenMPI is the most desirable MPI implementation to use.** It is the only one that works for job suspension, checkpointing, and task migration to other processors. These capabilities are needed to enable opportunistic use of idle nodes as well as to configure short-term and long-term queues.
Some software comes packaged with other MPI implementations that IT cannot change. In those cases, their VALET configuration files use the bundled MPI implementation. However, we recommend that you use OpenMPI whenever you need an MPI implementation.
=== MPI compiler wrappers ===
The [[http://www.openmpi.org/|OpenMPI]] implementation provides OpenMPI library compilers for C, C++, Fortran 77, 90, and 95. These //compiler wrappers// add MPI support to the actual compiler suites by passing additional information to the compiler. You simply use the MPI compiler wrapper in place of the compiler name you would normally use.
The compiler suite that's used depends on your UNIX environment settings. Use VALET commands to simultaneously set your environment to use the OpenMPI implementation and to select a particular compiler suite. The commands for the four compiler suites are:
vpkg_require openmpi/1.4.4-pgi
vpkg_require openmpi/1.4.4-open64
vpkg_require openmpi/1.4.4-intel64
vpkg_require openmpi/1.4.4-gcc
//(Type **vpkg_versions openmpi** to see if newer versions are available.)//
The **vpkg_require** command selects the MPI and compiler suite combination, and then you may use the compiler wrapper commands repeatedly. The wrapper name depends only on the language used, not the compiler suite you choose: mpicc (C), mpicxx or mpic++ (C++), mpif77 (Fortran 77), and mpif90 (Fortran 90 and 95).
== Fortran example: ==
vpkg_require openmpi/1.4.4-pgi
mpif90 -fpic fdriver.f90 -o driver
== C example: ==
vpkg_require openmpi/1.4.4-pgi
mpicc -fpic cdriver.c -o driver
You may use other compiler flags listed in each [[#the-compiler-suites|compiler suite's documentation]].
To modify the options used by the MPI wrapper commands, consult the [[http://www.open-mpi.org/faq/?category=mpi-apps|FAQ section]] of the OpenMPI web site.
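To see what a wrapper actually passes to the underlying compiler, Open MPI's wrappers accept a ''%%--showme%%'' option that prints the command line without running it. A minimal sketch, assuming an Open MPI ''mpicc'' is on your path (for example after one of the **vpkg_require openmpi/...** commands above); the ''wrapper_found'' variable is only there for illustration.

```shell
if command -v mpicc >/dev/null 2>&1; then
    wrapper_found=yes
    # Print the full underlying compiler command, then the compile and link flags separately.
    # The fallback message covers wrappers from other MPI implementations.
    mpicc --showme         || echo "this mpicc does not accept --showme"
    mpicc --showme:compile || echo "this mpicc does not accept --showme:compile"
    mpicc --showme:link    || echo "this mpicc does not accept --showme:link"
else
    wrapper_found=no
    echo "mpicc not found; load an OpenMPI package first"
fi
```

The first word of the ''%%--showme%%'' output is the real compiler, which is a quick way to check that the compiler suite you selected with VALET is the one the wrapper will use.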
===== Programming libraries =====
==== Introduction ====
IT installs high-quality math and utility libraries that are used by many applications. These libraries provide highly optimized math packages and functions. To determine which compilers IT used to prepare a library version, use the **vpkg_versions** VALET command.
Here is a representative sample of installed libraries. Use the **vpkg_list** command to see the most current list of libraries.
== Open-source libraries ==
* [[http://math-atlas.sourceforge.net/|ATLAS]]: Automatically Tuned Linear Algebra Software (portable)
* [[http://www.fftw.org/|FFTW]]: Discrete Fast Fourier Transform library
* [[http://www.tacc.utexas.edu/tacc-projects/gotoblas2/|GOTOBLAS2]]: Enhanced BLAS routines from the Texas Advanced Computing Center (TACC)
* [[http://www.hdfgroup.org/products/hdf4/|HDF4]] and [[http://www.hdfgroup.org/HDF5/|HDF5]]: Hierarchical Data Format suite (file formats and libraries for storing and organizing large, numerical data collections)
* [[http://acts.nersc.gov/hypre/#Documentation|HYPRE]]: High-performance preconditioners for linear system solvers (from LLNL)
* [[http://www.netlib.org/lapack|LAPACK]]: Linear algebra routines
* [[http://matplotlib.sourceforge.net/|Matplotlib]]: Python-based 2D publication-quality plotting library
* [[http://www.unidata.ucar.edu/software/netcdf/|netCDF]]: network Common Data Form for creation, access and sharing of array-oriented scientific data
* [[http://netlib.org/scalapack/scalapack_home.html|ScaLAPACK]]: Scalable LAPACK, a subset of LAPACK routines redesigned for distributed memory MIMD parallel computers using MPI
* [[http://www.vtk.org/|VTK]]: Visualization ToolKit, a platform for 3D computer graphics and visualization
== Commercial libraries ==
* [[https://developer.amd.com/amd-aocl/|AOCL]]: AMD Optimizing CPU Libraries (See [[https://developer.amd.com/wp-content/resources/57404_User_Guide_AMD_AOCL_v3.2_GA.pdf|AMD's AOCL User Guide]].) AOCL is the successor to ACML.
* [[http://www.roguewave.com/products/imsl|IMSL]]: RogueWave's mathematical and statistical libraries
* [[http://software.intel.com/en-us/articles/intel-mkl/?utm_source=google&utm_medium=cpc&utm_term=intel_mkl&utm_content=dpd_us_hpc_mkl& utm_campaign=DIV_US_DPD_%28S%29|MKL]]: Intel's Math Kernel Library
* [[http://www.nag.com/numeric/numerical_libraries.asp|NAG]]: Numerical Algorithms Group's numerical libraries
The libraries will be optimized for a given cluster architecture. Note that the calling sequences of some of the commercial library routines differ from their open-source counterparts.
==== Using libraries ====
=== Introduction ===
This section shows you how to link your program with libraries you or your colleagues have created or with centrally installed libraries such as ACML or FFTW. The examples introduce special environment variables (FFLAGS, CFLAGS, CPPFLAGS, LDFLAGS and LDLIBS) that keep the compile commands short. The VALET commands **vpkg_require** and **vpkg_devrequire** define the working environment for the compiler suite you choose.
Joint use of VALET and these environment variables will also prepare your UNIX environment to support your use of **make** for program development. VALET will accommodate using one or several libraries, and you can extend its functionality for software you develop or install.
==== Intel compiler suite ====
You should use Intel MKL, a highly optimized BLAS/LAPACK library.
If you use the Intel compilers, you can add ''-mkl'' to your link command, e.g.
ifort -o program -mkl=sequential [...]
ifort -o program -qopenmp -mkl=parallel [...]
The former uses the serial library, the latter uses the threaded library that respects the OpenMP runtime environment of the job for multithreaded BLAS/LAPACK execution.
If you're not using the Intel compilers, you'll need to generate the appropriate compiler and linker options using [[https://software.intel.com/en-us/articles/intel-mkl-link-line-advisor|Intel's online tool]].
Please use "dynamic linking" since that allows MKL to adjust the underlying kernel functions at runtime according to the hardware on which you're running. If you use static linking, you're tied to the lowest common hardware model available and you will usually not see as good performance.
Before compiling or building, and again at runtime, use VALET to load a version of the Intel suite into your environment, for example:
vpkg_require intel/2019
Among other things, this will set ''MKLROOT'' in the environment to the appropriate path, which the link advisor references. The MKL version (year) matches that of the compiler version (year).
To list the installed versions of the Intel suite, use:
vpkg_versions intel
==== PGI compiler suite ====
=== Fortran examples illustrated with the PGI compiler suite ===
== Reviewing the basic compilation command ==
The general command for compiling source code:
<compiler> <compiler_flags> <source_code_filename> -o <executable_filename>
For example:
vpkg_require pgi
pgfortran -fpic fdriver.f90 -o driver
== Using user-supplied libraries ==
To compile fdriver.f90 and link it to a shared F90 library named lib**//fstat//**.so stored in $HOME/lib, add the library location and the library name (**//fstat//**) to the command:
pgfortran -fast -fpic -L$HOME/lib fdriver.f90 -lfstat -o driver
The **-L** option flag is for the shared library directory's name; the **-l** flag is for the specific library name.
You can simplify this compiler command by creating and exporting three special environment variables. FFLAGS represents a set of Fortran compiler option flags; LDFLAGS represents the location of your library; and LDLIBS represents the choice of library.
vpkg_require pgi
export FFLAGS='-fpic'
export LDFLAGS="-L$HOME/lib"
export LDLIBS='-lfstat'
pgfortran $FFLAGS $LDFLAGS fdriver.f90 $LDLIBS -o driver
Extending this further, you might have several libraries in one or more locations. In that case, list all of the '-l' flags in the LDLIBS statement, for example,
export LDLIBS='-lfstat -lfpoly'
and all of the '-L' flags in the LDFLAGS statement. (The order in which the '-L' directories appear in LDFLAGS determines the search order.)
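Quoting matters when these variables contain other variables such as $HOME. Double quotes let the shell expand them when the variable is set; single quotes store the text literally, and the shell does not expand it again when the variable is later used on a command line. A minimal sketch of the difference (the variable names ''literal_flags'' and ''expanded_flags'' are illustrative only):

```shell
# Single quotes: the text is stored exactly as written; $HOME is NOT expanded.
literal_flags='-L$HOME/lib'

# Double quotes: $HOME is replaced by your home directory at assignment time.
expanded_flags="-L$HOME/lib"

echo "single quotes: $literal_flags"
echo "double quotes: $expanded_flags"
```

With the single-quoted form, the linker would be told to search a directory literally named ''$HOME/lib'', which is rarely what you want.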
== Using centrally supplied libraries (ACML, MKL, FFTW, etc.) ==
This extends the previous section's example by illustrating how to use VALET's **vpkg_devrequire** command to locate and link a centrally supplied library such as AMD's Core Math Library, ACML. Several releases (versions) of a library may be installed, and some may have been compiled with several compiler suites.
To view your choices, use VALET's **vpkg_versions** command:
vpkg_versions acml
The example below uses the acml/5.0.0-pgi-fma4 version, the single-threaded, ACML 5.0.0 FMA4 library compiled with the PGI 11 compilers. Since that version depends on the PGI 11 compiler suite,
vpkg_devrequire acml/5.0.0-pgi-fma4 pgi
jointly sets the UNIX environment for both ACML and the PGI compiler suite. Therefore, you __should not also issue__ a **vpkg_require pgi** command.
Unlike **vpkg_require**, **vpkg_devrequire** also modifies key environment variables including LDFLAGS.
Putting it all together, the complete example using the library named **acml** is:
vpkg_devrequire acml/5.0.0-pgi-fma4 pgi
export FFLAGS='-fpic'
export LDLIBS='-lacml'
pgfortran $FFLAGS $LDFLAGS fdriver.f90 $LDLIBS -o driver
Note that **$LDFLAGS** must be in the compile statement but does not need an explicit **export** command here. The **vpkg_devrequire** command above defined and exported LDFLAGS and its value.
== Using user-supplied libraries and centrally supplied libraries together ==
This final example illustrates how to use your **fstat** and **fpoly** libraries (both in $HOME/lib) with the ACML 5.0.0 library:
vpkg_devrequire acml/5.0.0-pgi-fma4 pgi
export FFLAGS='-fpic'
export LDFLAGS="-L$HOME/lib $LDFLAGS"
export LDLIBS='-lacml -lfstat -lfpoly'
pgfortran $FFLAGS $LDFLAGS fdriver.f90 $LDLIBS -o driver
Remember that the library search order depends on the order of the **-L** directories in LDFLAGS.
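To check the search order you have built up, you can print LDFLAGS one option per line. The sketch below simulates the value that **vpkg_devrequire** would export with a made-up directory (''/opt/shared/acml/lib'' is hypothetical, not the real install path) so that it runs anywhere.

```shell
# Stand-in for the value vpkg_devrequire would export (hypothetical path).
LDFLAGS="-L/opt/shared/acml/lib"

# Prepend your own directory; double quotes are needed so that the
# existing $LDFLAGS is expanded and kept.
LDFLAGS="-L$HOME/lib $LDFLAGS"
export LDFLAGS

# Print one option per line; the directories are searched in this order.
printf '%s\n' $LDFLAGS
```

Because your directory comes first, a library in $HOME/lib would be found before a library of the same name in the centrally supplied directory.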
=== C examples illustrated with the PGI compiler suite ===
== Reviewing the basic compilation command ==
The general command for compiling source code:
<compiler> <compiler_flags> <source_code_filename> -o <executable_filename>
For example,
vpkg_require pgi
pgcc -fpic cdriver.c -o driver
== Using user-supplied libraries ==
To compile cdriver.c and link it to a shared C library named lib**//cstat//**.so stored in $HOME/lib and include header files in $HOME/inc, add the library location and the library name (**//cstat//**) to the command.
pgcc -fpic -I$HOME/inc -L$HOME/lib cdriver.c -lcstat -o driver
The **-I** option flag is for the directory that holds the include (header) files; the **-L** flag is for the shared library directory's name; and the **-l** flag is for the specific library name.
You can simplify this compiler command by creating and exporting four special environment variables. CFLAGS represents a set of C compiler option flags; CPPFLAGS represents the C preprocessor flags (such as **-I**); LDFLAGS represents the location of your shared library; and LDLIBS represents the choice of library.
vpkg_require pgi
export CFLAGS='-fpic'
export CPPFLAGS="-I$HOME/inc"
export LDFLAGS="-L$HOME/lib"
export LDLIBS='-lcstat'
pgcc $CFLAGS $CPPFLAGS $LDFLAGS cdriver.c $LDLIBS -o driver
Extending this further, you might have several libraries in one or more locations. In that case, list all of the '-l' flags in the LDLIBS statement, for example,
export LDLIBS='-lcstat -lcpoly'
and all of the '-L' flags in the LDFLAGS statement. (The order in which the '-L' directories appear in LDFLAGS determines the search order.)
== Using centrally supplied libraries (ACML, MKL, FFTW, etc.) ==
This extends the previous section's example by illustrating how to use VALET's **vpkg_devrequire** command to locate and link a system-supplied library, such as AMD's Core Math Library, ACML. Several releases (versions) of a library may be installed, and some may have been compiled with several compiler suites.
To view your choices, use VALET's **vpkg_versions** command:
vpkg_versions acml
The example below uses the acml/5.0.0-pgi-fma4 version, the single-threaded ACML 5.0.0 FMA4 library compiled with the PGI 11 compilers. Since that version depends on the PGI 11 compiler suite,
vpkg_devrequire acml/5.0.0-pgi-fma4 pgi
jointly sets the UNIX environment for both ACML and the PGI compiler suite. Therefore, you __should not also issue__ a **vpkg_require pgi** command.
Unlike **vpkg_require**, **vpkg_devrequire** also modifies key environment variables including LDFLAGS and CPPFLAGS.
Putting it all together, the complete example using the library named **acml** is:
vpkg_devrequire acml/5.0.0-pgi-fma4 pgi
export CFLAGS='-fpic'
export LDLIBS='-lacml'
pgcc $CFLAGS $CPPFLAGS $LDFLAGS cdriver.c $LDLIBS -o driver
Note that $CPPFLAGS and $LDFLAGS must be in the compile statement even though **export CPPFLAGS** and **export LDFLAGS** statements didn't appear above. The **vpkg_devrequire** command above defined and exported CPPFLAGS and LDFLAGS and their values.
== Using user-supplied libraries and centrally supplied libraries together ==
The final example illustrates how to use your **cstat** and **cpoly** libraries (both in $HOME/lib) with the **acml** library:
vpkg_devrequire acml/5.0.0-pgi-fma4 pgi
export CFLAGS='-fpic'
export CPPFLAGS="$CPPFLAGS -I$HOME/inc"
export LDFLAGS="-L$HOME/lib $LDFLAGS"
export LDLIBS='-lacml -lcstat -lcpoly'
pgcc $CFLAGS $CPPFLAGS $LDFLAGS cdriver.c $LDLIBS -o driver
Remember that the library search order depends on the order of the **-L** directories in LDFLAGS.