Differences
This shows you the differences between two versions of the page.
software:lapack:caviness [2020-03-05 14:24] – [Compiling with intel and mkl library] anita | software:lapack:caviness [2021-04-27 16:21] (current) – external edit 127.0.0.1 | ||
---|---|---|---|
Line 25: | Line 25: | ||
already have.</ | already have.</ | ||
- | ===== Compiling with intel and mkl library ===== | + | ===== Compiling with Intel and MKL library ===== |
- | The [[https:// | + | The [[https:// |
<note tip> | <note tip> | ||
- | You can get the Package name and infer the update number from VALET, but you may also need the version of the compiler and the version of the LAPACK interfaces supported in the MKL component of the package. | + | You can get the Package name and infer the update number from VALET, but you may also need the version of the compiler and the version of the LAPACK interfaces supported in the MKL component of the package. |
| | ||
Line 41: | Line 41: | ||
==== VALET and ifort ==== | ==== VALET and ifort ==== | ||
- | Assuming you have the **'' | + | Assuming you have the **'' |
- | compile the source file to an executable that links with the MKL library. | + | compile the source file to an executable that links with the MKL library. Remember that VALET will choose the default version of the Intel Compiler Suite if you do not specify a version. | ||
< | < | ||
- | vpkg_devrequire intel/ | + | workgroup -g << |
+ | vpkg_devrequire intel | ||
ifort -mkl dgels-ex.f -o dgels-ex | ifort -mkl dgels-ex.f -o dgels-ex | ||
</ | </ | ||
Line 64: | Line 65: | ||
<note tip> | <note tip> | ||
< | < | ||
- | vpkg_devrequire intel/ | + | vpkg_devrequire intel |
export FC=ifort | export FC=ifort | ||
export FFLAGS=-mkl | export FFLAGS=-mkl | ||
Line 70: | Line 71: | ||
</ | </ | ||
</ | </ | ||
- | ==== qsub file to test ==== | + | ==== sbatch |
- | The '' | + | The '' |
<file bash test.qs> | <file bash test.qs> | ||
- | #$ -N dgels-ex | + | #!/ |
- | #$ -pe threads 4 | + | # |
+ | # Sections of this script that can/should be edited are delimited by a | ||
+ | # [EDIT] tag. All Slurm job options are denoted by a line that starts | ||
+ | # with "# | ||
+ | # the command line. Slurm job options can easily be disabled in a | ||
+ | # script by inserting a space in the prefix, e.g. "# SLURM " and | ||
+ | # reenabled by deleting that space. | ||
+ | # | ||
+ | # This is a batch job template for a program using multiple processor | ||
+ | # cores/ | ||
+ | # parallelism or explicit threading via the pthreads library. | ||
+ | # | ||
+ | # Do not alter the --nodes/ | ||
+ | #SBATCH | ||
+ | #SBATCH --ntasks=1 | ||
+ | # | ||
+ | # [EDIT] Indicate the number of processor cores/threads | ||
+ | # by the job: | ||
+ | # | ||
+ | #SBATCH --cpus-per-task=4 | ||
+ | # | ||
+ | # [EDIT] All jobs have memory limits imposed. | ||
+ | # CPU allocated to the job. The default can be overridden either | ||
+ | # with a per-node value (--mem) or a per-CPU value (--mem-per-cpu) | ||
+ | # with unitless values in MB and the suffixes K|M|G|T denoting | ||
+ | # kibi, mebi, gibi, and tebibyte units. | ||
+ | # the "#" | ||
+ | # | ||
+ | # SBATCH --mem=8G | ||
+ | # SBATCH --mem-per-cpu=1024M | ||
+ | # | ||
+ | # .... more options not used .... | ||
+ | # | ||
+ | # [EDIT] It can be helpful to provide a descriptive (terse) name for | ||
+ | # the job (be sure to use quotes if there' | ||
+ | # name): | ||
+ | # | ||
+ | #SBATCH --job-name=dgels-ex | ||
+ | # | ||
+ | # [EDIT] The partition determines which nodes can be used and with what | ||
+ | # maximum runtime limits, etc. Partition limits can be displayed | ||
+ | # with the "sinfo --summarize" | ||
+ | # | ||
+ | # SBATCH --partition=standard | ||
+ | # | ||
+ | # To run with priority-access to resources owned by your workgroup, | ||
+ | # use the " | ||
+ | # | ||
+ | #SBATCH --partition=_workgroup_ | ||
+ | # | ||
+ | # [EDIT] The maximum runtime for the job; a single integer is interpreted | ||
+ | # as a number of minutes, otherwise use the format | ||
+ | # | ||
+ | # d-hh: | ||
+ | # | ||
+ | # Jobs default to the default runtime limit of the chosen partition | ||
+ | # if this option is omitted. | ||
+ | # | ||
+ | #SBATCH --time=0-02: | ||
+ | # | ||
+ | # You can also provide a minimum acceptable runtime so the scheduler | ||
+ | # may be able to run your job sooner. | ||
+ | # value, it will be set to match the maximum runtime limit (discussed | ||
+ | # above). | ||
+ | # | ||
+ | # SBATCH --time-min=0-01: | ||
+ | # | ||
+ | # .... more options not used .... | ||
+ | # | ||
+ | # Do standard OpenMP environment setup: | ||
+ | # | ||
+ | . / | ||
+ | |||
+ | # | ||
+ | # [EDIT] Execute your OpenMP/ | ||
+ | # | ||
echo "--- Set environment ---" | echo "--- Set environment ---" | ||
- | source / | + | vpkg_require intel |
- | vpkg_require intel/ | + | |
echo "" | echo "" | ||
- | echo "--- Run Test with $NSLOTS | + | echo "--- Run Test with $SLURM_CPUS_PER_TASK |
- | export MKL_NUM_THREADS=$NSLOTS | + | export MKL_NUM_THREADS=$SLURM_CPUS_PER_TASK |
- | time ./$JOB_NAME | + | time ./$SLURM_JOB_NAME |
echo "" | echo "" | ||
echo "--- Compare Results ---" | echo "--- Compare Results ---" | ||
- | cat $JOB_NAME.r | + | cat $SLURM_JOB_NAME.r |
</ | </ | ||
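The job script above takes its thread count from `$SLURM_CPUS_PER_TASK`, which Slurm sets only when `--cpus-per-task` is given. As a minimal sketch, assuming you also want to run the same lines outside a batch job, the count can fall back to a single thread. The helper name `mkl_thread_count` and the fallback value of 1 are assumptions for illustration and are not part of the original script.

```bash
#!/bin/bash
# Hypothetical helper (not in the original script): print the core count
# Slurm granted to the task, or 1 when the variable is unset or empty.
mkl_thread_count() {
    echo "${SLURM_CPUS_PER_TASK:-1}"
}

# Give MKL the same number of threads as cores allocated to the job.
export MKL_NUM_THREADS="$(mkl_thread_count)"
echo "--- Run Test with $MKL_NUM_THREADS threads ---"
```

With this guard an interactive test run uses one thread instead of passing an empty value to MKL.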
==== Test result output ==== | ==== Test result output ==== | ||
< | < | ||
+ | [traine@login01 nagex]$ workgroup -g it_css | ||
+ | [(it_css: | ||
+ | Submitted batch job 6718859 | ||
+ | [(it_css: | ||
+ | -- OpenMP job setup complete: | ||
+ | -- OMP_THREAD_LIMIT | ||
+ | -- OMP_PROC_BIND | ||
+ | -- OMP_PLACES | ||
+ | -- MP_BLIST | ||
+ | |||
--- Set environment --- | --- Set environment --- | ||
- | WARNING: ' | + | Adding package `intel/2018u4` to your environment |
- | Adding package `intel/2013-2.144-64bit` to your environment | + | |
--- Run Test with 4 threads --- | --- Run Test with 4 threads --- | ||
DGELS Example Program Results | DGELS Example Program Results | ||
- | + | ||
Least squares solution | Least squares solution | ||
1.5339 | 1.5339 | ||
- | + | ||
| | ||
2.22E-02 | 2.22E-02 | ||
- | real 0m0.966s | + | real 0m1.043s |
- | user 0m0.003s | + | user 0m0.007s |
- | sys 0m0.031s | + | sys |
--- Compare Results --- | --- Compare Results --- | ||
Line 119: | Line 203: | ||
2.22E-02 | 2.22E-02 | ||
</ | </ | ||
- | |||
- | <note warning> | ||
- | that GNU **'' | ||
- | |||
- | If you are debugging on the compute nodes or want to remove the warning, add
- | < | ||
- | vpkg_require gcc/4.6 | ||
- | </ | ||
- | before the intel '' | ||
- | </ | ||
<note important> | <note important> | ||
Line 138: | Line 212: | ||
* Programs with small arrays will not benefit from the multi-threaded library, and may suffer a bit from the system overhead of maintaining multiple threads. | * Programs with small arrays will not benefit from the multi-threaded library, and may suffer a bit from the system overhead of maintaining multiple threads. | ||
- | * Sequential programs are better suited for running simultaneous instances. | + | * Sequential programs are better suited for running simultaneous instances. |
* You may be able to take control of the parallelism in your program with OpenMP compiler directives. | * You may be able to take control of the parallelism in your program with OpenMP compiler directives. | ||
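The point about simultaneous instances of a sequential program can be sketched as follows. This is only an illustration under stated assumptions: `run_case` is a hypothetical stand-in for your own MKL-linked program, and the four case numbers and output file names are invented for the example. Each instance is limited to one MKL thread so that the instances together do not use more cores than the job requested.

```bash
#!/bin/bash
# Hypothetical stand-in for a sequential MKL-linked program; replace the
# body with your own command (this function is an assumption).
run_case() {
    echo "case $1 done with MKL_NUM_THREADS=$MKL_NUM_THREADS"
}

# One MKL thread per instance, so four instances fit on four cores.
export MKL_NUM_THREADS=1

# Scratch directory for the per-case output files.
outdir="$(mktemp -d)"

# Start one background instance per case, then wait for all of them.
for case_id in 1 2 3 4; do
    run_case "$case_id" > "$outdir/case-$case_id.out" &
done
wait

cat "$outdir"/case-*.out
```

Whether this beats a single multi-threaded run depends on the array sizes, as noted in the list above.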
- | ===== Compiling with PGI and ACML library ===== | ||
- | |||
- | The [[http:// | ||
- | |||
- | <note tip> | ||
- | From the release notes in the file ''/ | ||
- | New features of release 5.3.0 of ACML | ||
- | | ||
- | </ | ||
| |