Differences
This shows you the differences between two versions of the page.
software:lapack:caviness [2020-03-06 17:44] – [Compiling with intel and mkl library] anita | software:lapack:caviness [2021-04-27 16:21] (current) – external edit 127.0.0.1 | ||
---|---|---|---|
Line 25: | Line 25: | ||
already have.</ | already have.</ | ||
- | ===== Compiling with intel and mkl library ===== | + | ===== Compiling with Intel and MKL library ===== |
The [[https:// | The [[https:// | ||
Line 41: | Line 41: | ||
==== VALET and ifort ==== | ==== VALET and ifort ==== | ||
- | Assuming you have the **'' | + | Assuming you have the **'' |
- | compile the source file to an executable that links with the MKL library. | + | compile the source file to an executable that links with the MKL library. Remember that VALET chooses the default version of the Intel Compiler Suite if you do not specify a version. |
< | < | ||
- | vpkg_devrequire intel/ | + | workgroup -g << |
+ | vpkg_devrequire intel | ||
ifort -mkl dgels-ex.f -o dgels-ex | ifort -mkl dgels-ex.f -o dgels-ex | ||
</ | </ | ||
Line 64: | Line 65: | ||
<note tip> | <note tip> | ||
< | < | ||
- | vpkg_devrequire intel/ | + | vpkg_devrequire intel |
export FC=ifort | export FC=ifort | ||
export FFLAGS=-mkl | export FFLAGS=-mkl | ||
Line 70: | Line 71: | ||
</ | </ | ||
</ | </ | ||
- | ==== qsub file to test ==== | + | ==== sbatch |
- | The '' | + | The '' |
<file bash test.qs> | <file bash test.qs> | ||
- | #$ -N dgels-ex | + | #!/ |
- | #$ -pe threads 4 | + | # |
+ | # Sections of this script that can/should be edited are delimited by a | ||
+ | # [EDIT] tag. All Slurm job options are denoted by a line that starts | ||
+ | # with "# | ||
+ | # the command line. Slurm job options can easily be disabled in a | ||
+ | # script by inserting a space in the prefix, e.g. "# SBATCH " and | ||
+ | # reenabled by deleting that space. | ||
+ | # | ||
+ | # This is a batch job template for a program using multiple processor | ||
+ | # cores/ | ||
+ | # parallelism or explicit threading via the pthreads library. | ||
+ | # | ||
+ | # Do not alter the --nodes/ | ||
+ | #SBATCH | ||
+ | #SBATCH --ntasks=1 | ||
+ | # | ||
+ | # [EDIT] Indicate the number of processor cores/threads | ||
+ | # by the job: | ||
+ | # | ||
+ | #SBATCH --cpus-per-task=4 | ||
+ | # | ||
+ | # [EDIT] All jobs have memory limits imposed. | ||
+ | # CPU allocated to the job. The default can be overridden either | ||
+ | # with a per-node value (--mem) or a per-CPU value (--mem-per-cpu) | ||
+ | # with unitless values in MB and the suffixes K|M|G|T denoting | ||
+ | # kibi, mebi, gibi, and tebibyte units. | ||
+ | # the "#" | ||
+ | # | ||
+ | # SBATCH --mem=8G | ||
+ | # SBATCH --mem-per-cpu=1024M | ||
+ | # | ||
+ | # .... more options not used .... | ||
+ | # | ||
+ | # [EDIT] It can be helpful to provide a descriptive (terse) name for | ||
+ | # the job (be sure to use quotes if there' | ||
+ | # name): | ||
+ | # | ||
+ | #SBATCH --job-name=dgels-ex | ||
+ | # | ||
+ | # [EDIT] The partition determines which nodes can be used and with what | ||
+ | # maximum runtime limits, etc. Partition limits can be displayed | ||
+ | # with the "sinfo --summarize" | ||
+ | # | ||
+ | # SBATCH --partition=standard | ||
+ | # | ||
+ | # To run with priority-access to resources owned by your workgroup, | ||
+ | # use the " | ||
+ | # | ||
+ | #SBATCH --partition=_workgroup_ | ||
+ | # | ||
+ | # [EDIT] The maximum runtime for the job; a single integer is interpreted | ||
+ | # as a number of minutes, otherwise use the format | ||
+ | # | ||
+ | # d-hh: | ||
+ | # | ||
+ | # Jobs default to the default runtime limit of the chosen partition | ||
+ | # if this option is omitted. | ||
+ | # | ||
+ | #SBATCH --time=0-02: | ||
+ | # | ||
+ | # You can also provide a minimum acceptable runtime so the scheduler | ||
+ | # may be able to run your job sooner. | ||
+ | # value, it will be set to match the maximum runtime limit (discussed | ||
+ | # above). | ||
+ | # | ||
+ | # SBATCH --time-min=0-01: | ||
+ | # | ||
+ | # .... more options not used .... | ||
+ | # | ||
+ | # Do standard OpenMP environment setup: | ||
+ | # | ||
+ | . / | ||
+ | |||
+ | # | ||
+ | # [EDIT] Execute your OpenMP/ | ||
+ | # | ||
echo "--- Set environment ---" | echo "--- Set environment ---" | ||
- | source / | + | vpkg_require intel |
- | vpkg_require intel/ | + | |
echo "" | echo "" | ||
- | echo "--- Run Test with $NSLOTS | + | echo "--- Run Test with $SLURM_CPUS_PER_TASK |
- | export MKL_NUM_THREADS=$NSLOTS | + | export MKL_NUM_THREADS=$SLURM_CPUS_PER_TASK |
- | time ./$JOB_NAME | + | time ./$SLURM_JOB_NAME |
echo "" | echo "" | ||
echo "--- Compare Results ---" | echo "--- Compare Results ---" | ||
- | cat $JOB_NAME.r | + | cat $SLURM_JOB_NAME.r |
</ | </ | ||
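The job script above takes its thread count from `SLURM_CPUS_PER_TASK`, which Slurm only sets when `--cpus-per-task` is requested. A minimal sketch of a fallback, assuming a default of one thread is acceptable outside such a job (the helper name `mkl_threads` is hypothetical and is not part of VALET, Slurm, or MKL):

```shell
# Hypothetical helper: report the thread count MKL should use.
# Uses the Slurm allocation when present, otherwise falls back to 1.
mkl_threads() {
    echo "${SLURM_CPUS_PER_TASK:-1}"
}

export MKL_NUM_THREADS="$(mkl_threads)"
echo "MKL will use $MKL_NUM_THREADS thread(s)"
```

With this in place the same lines can be run interactively or inside the batch job without an empty `MKL_NUM_THREADS`.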
==== Test result output ==== | ==== Test result output ==== | ||
< | < | ||
+ | [traine@login01 nagex]$ workgroup -g it_css | ||
+ | [(it_css: | ||
+ | Submitted batch job 6718859 | ||
+ | [(it_css: | ||
+ | -- OpenMP job setup complete: | ||
+ | -- OMP_THREAD_LIMIT | ||
+ | -- OMP_PROC_BIND | ||
+ | -- OMP_PLACES | ||
+ | -- MP_BLIST | ||
+ | |||
--- Set environment --- | --- Set environment --- | ||
- | WARNING: ' | + | Adding package `intel/2018u4` to your environment |
- | Adding package `intel/2013-2.144-64bit` to your environment | + | |
--- Run Test with 4 threads --- | --- Run Test with 4 threads --- | ||
DGELS Example Program Results | DGELS Example Program Results | ||
- | + | ||
Least squares solution | Least squares solution | ||
1.5339 | 1.5339 | ||
- | + | ||
| | ||
2.22E-02 | 2.22E-02 | ||
- | real 0m0.966s | + | real 0m1.043s |
- | user 0m0.003s | + | user 0m0.007s |
- | sys 0m0.031s | + | sys |
--- Compare Results --- | --- Compare Results --- | ||
Line 119: | Line 203: | ||
2.22E-02 | 2.22E-02 | ||
</ | </ | ||
- | |||
- | <note warning> | ||
- | that GNU **'' | ||
- | |||
- | If you are debugging on the compute nodes or want to remove the warning, add | ||
- | < | ||
- | vpkg_require gcc/4.6 | ||
- | </ | ||
- | before the intel '' | ||
- | </ | ||
<note important> | <note important> | ||
Line 138: | Line 212: | ||
* Programs with small arrays will not benefit from the multi-threaded library, and may suffer a bit from the system overhead of maintaining multiple threads. | * Programs with small arrays will not benefit from the multi-threaded library, and may suffer a bit from the system overhead of maintaining multiple threads. | ||
- | * Sequential programs are better suited for running simultaneous instances. | + | * Sequential programs are better suited for running simultaneous instances. |
* You may be able to take control of the parallelism in your program with OpenMP compiler directives. | * You may be able to take control of the parallelism in your program with OpenMP compiler directives. | ||
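One way to check the points above for a particular program is to run it at several thread counts and compare the timings. A minimal sketch, where `run_with_threads` is a hypothetical helper (not a VALET or MKL command) and the command to run is passed as a single string:

```shell
# Hypothetical helper: run a command once per requested MKL thread count.
# Usage: run_with_threads 'command' count [count ...]
run_with_threads() {
    cmd="$1"
    shift
    for n in "$@"; do
        echo "--- MKL_NUM_THREADS=$n ---"
        # The assignment applies only to this one invocation.
        MKL_NUM_THREADS="$n" sh -c "$cmd"
    done
}
```

For example, `run_with_threads ./dgels-ex 1 2 4` could be called from within a job that was allocated at least four cores; for a problem as small as this example, the single-threaded run may well be the fastest.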
- | ===== Compiling with PGI and ACML library ===== | ||
- | |||
- | The [[http:// | ||
- | |||
- | <note tip> | ||
- | From the release notes in the file ''/ | ||
- | New features of release 5.3.0 of ACML | ||
- | | ||
- | </ | ||
| |