The NAG site has a collection of examples to test LAPACK drivers. A driver routine calls the necessary lower-level routines to solve one particular problem. For example, a real linear least squares problem is solved by the dgels driver. Driver routines may not be in all LAPACK libraries, but you can download them from the netlib driver routines. The source of a driver routine is useful for learning how to use the lower-level routines.
Each example in the NAG collection has a source file, an input file, and an output result file, which should match your result.
- dgels-ex.f: The Fortran 77 source file
- dgels-ex.d: The input data file to be read from unit 5 (standard input)
- dgels-ex.r: Should match the output on unit 6 (standard output)
You can use wget to get these with the script:

if [ ! -f "dgels-ex.f" ]; then
  wget http://www.nag.com/lapack-ex/examples/source/dgels-ex.f
  wget http://www.nag.com/lapack-ex/examples/data/dgels-ex.d
  wget http://www.nag.com/lapack-ex/examples/results/dgels-ex.r
else
  touch "dgels-ex.f"
fi

You can type the wget commands directly in your terminal window, but it is a good idea to save them in a file for later reference. In this case, you should enclose them in a conditional if statement to avoid downloading a file you already have.
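
The conditional download described above can be generalized a little. The following is a minimal sketch, not the script used on this page: it checks each of the three files separately rather than only dgels-ex.f. The function names fetch_if_missing and fetch_example and the FETCH override variable are hypothetical names introduced for illustration; the URL layout is taken from the wget commands above.

```shell
#!/bin/bash
# Sketch: download a NAG example file only if it is not already present.
# BASE is the URL prefix used by the wget commands on this page.
BASE="http://www.nag.com/lapack-ex/examples"

# fetch_if_missing SUBDIR FILE   (hypothetical helper name)
# FETCH is an assumed override hook; when unset, wget is used.
fetch_if_missing() {
    local subdir="$1" file="$2"
    if [ -f "$file" ]; then
        echo "have $file, skipping"
    else
        ${FETCH:-wget} "$BASE/$subdir/$file"
    fi
}

# fetch_example NAME   (hypothetical helper name)
# Fetch the source, data, and results files for one driver example.
fetch_example() {
    local name="$1"
    fetch_if_missing source  "$name-ex.f" &&
    fetch_if_missing data    "$name-ex.d" &&
    fetch_if_missing results "$name-ex.r"
}
```

After sourcing this file, calling fetch_example dgels would download only the files that are missing from the current directory.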
The Intel Composer XE suite comes installed with a Fortran compiler and the MKL library. Use the VALET command vpkg_versions intel to find the latest version installed on Mills: Version 2013 (2.144).
Update 2 - February 2014
  Intel Fortran Compiler updated to 14.0.2
  Intel Math Kernel Library updated to 11.1 Update 2
and the details on the main product page for MKL 11.1,
  LAPACK 3.4.1 interfaces and enhancements
Assuming you have the dgels-ex.f Fortran 77 source file, use the VALET and compile commands to compile the source file to an executable that links with the MKL library.

vpkg_devrequire intel/14.0.2-64bit
ifort -mkl dgels-ex.f -o dgels-ex

The ifort compiler has an -mkl flag. From the man page or ifort --help:

-mkl[=<arg>]
    link to the Intel(R) Math Kernel Library (Intel(R) MKL) and bring in the associated headers
      parallel   - link using the threaded Intel(R) MKL libraries. This is the default when -mkl is specified
      sequential - link using the non-threaded Intel(R) MKL libraries
      cluster    - link using the Intel(R) MKL Cluster libraries plus the sequential Intel(R) MKL libraries

vpkg_devrequire intel/14.0.2-64bit
export FC=ifort
export FFLAGS=-mkl
make dgels-ex

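
The -mkl flag accepts the parallel and sequential arguments listed in the help text, so the FFLAGS setting can be derived from the number of threads you plan to run with. The following is a minimal sketch under that assumption; mkl_flag is a hypothetical helper name, and choosing sequential for a single thread is an illustrative design choice, not a recommendation from the Intel documentation.

```shell
#!/bin/bash
# mkl_flag THREADS   (hypothetical helper name)
# Print the -mkl variant: sequential for one thread, parallel otherwise.
mkl_flag() {
    local threads="${1:-1}"
    if [ "$threads" -le 1 ]; then
        echo "-mkl=sequential"
    else
        echo "-mkl=parallel"
    fi
}

# Example: set the variables that make reads for a single-threaded build.
FC=ifort
FFLAGS="$(mkl_flag 1)"
export FC FFLAGS
echo "FC=$FC FFLAGS=$FFLAGS"
```

With these variables exported, make dgels-ex would build against the non-threaded MKL libraries; passing a larger thread count to mkl_flag selects the threaded libraries, which is also what a bare -mkl does.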
The ifort compiler with the -mkl flag will compile and link to the threaded MKL libraries. Thus you should test in the threaded parallel environment, and export the number of slots to the MKL_NUM_THREADS environment variable.

#$ -N dgels-ex
#$ -pe threads 4
echo "--- Set environment ---"
source /opt/shared/valet/docs/valet.sh
vpkg_require intel/14.0.2-64bit
echo ""
echo "--- Run Test with $NSLOTS threads ---"
export MKL_NUM_THREADS=$NSLOTS
time ./$JOB_NAME < $JOB_NAME.d
echo ""
echo "--- Compare Results ---"
cat $JOB_NAME.r

--- Set environment ---
WARNING: 'gcc' was not found
Adding package `intel/2013-2.144-64bit` to your environment

--- Run Test with 4 threads ---
 DGELS Example Program Results
 Least squares solution
 1.5339 1.8707 -1.5241 0.0392
 Square root of the residual sum of squares
 2.22E-02

real 0m0.966s
user 0m0.003s
sys 0m0.031s

--- Compare Results ---
 DGELS Example Program Results
 Least squares solution
 1.5339 1.8707 -1.5241 0.0392
 Square root of the residual sum of squares
 2.22E-02

The 'gcc' warning appears because gdk is included for debugging, which requires gcc. This warning can be ignored if you are not debugging with gdk in the batch script. You will not get this warning on the head node, since the system version of gcc will always be found in your path.
If you are debugging on the compute nodes or want to remove the warning, add

vpkg_require gcc/4.6

before the intel vpkg_require command in your batch script file.
The test run calls the driver dgels and reproduces the correct results.
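
Instead of comparing the two listings by eye, the check can be automated with diff. The following is a minimal sketch: run_and_compare is a hypothetical helper name, and the NAME.out file is an assumed name for the saved output. It follows the same conventions as the batch script (NAME.d on standard input, NAME.r as the reference) and falls back to one thread when NSLOTS is not set, as happens outside a Grid Engine job.

```shell
#!/bin/bash
# run_and_compare NAME   (hypothetical helper name)
# Run ./NAME with NAME.d on standard input, save the output in NAME.out
# (an assumed file name), and compare it with the reference file NAME.r.
run_and_compare() {
    local name="$1"
    # NSLOTS is set by the scheduler; default to one thread when it is unset.
    export MKL_NUM_THREADS="${NSLOTS:-1}"
    "./$name" < "$name.d" > "$name.out" || return 2
    # -b ignores differences in the amount of white space.
    if diff -b "$name.out" "$name.r" > /dev/null; then
        echo "PASS: $name matches $name.r"
    else
        echo "FAIL: $name differs from $name.r"
        return 1
    fi
}
```

In the batch script this could replace the final cat with run_and_compare $JOB_NAME. An exact text comparison can report a failure when only the last printed digit differs, so a FAIL is a prompt to inspect the two files rather than proof of a wrong answer.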
This example used the default parallel MKL libraries. LAPACK is a collection of routines that parallelize nicely (for large problems), and MKL is an optimized multi-threaded library. For large problems you get the best performance with the default. However, there are three important considerations when using MKL.
The AMD Core Math Library (ACML) is from AMD developers, and is thus a good choice for the Mills chip set. Use the VALET command vpkg_versions acml to find the latest version installed on Mills: 5.3.0.
/opt/shared/ACML/5.3.0/ReleaseNotes

New features of release 5.3.0 of ACML
  Updated the LAPACK code to version 3.4.0.