
HPCC with Intel compiler, MKL and base FFT

Start by downloading and extracting the hpcc-1.4.3 source tarball:

curl -s http://icl.cs.utk.edu/projectsfiles/hpcc/download/hpcc-1.4.3.tar.gz | tar zx

The hpcc-1.4.3 directory contains all the files needed to run the benchmark. The remaining work is to adapt the build setup to the Intel compiler and MKL on Farber, which uses VALET to configure the software environment.

Copy the makefile hpl/setup/Make.Linux_ATHLON_FBLAS to hpl/Make.intel-mkl, then edit the copy:

  1. Comment out the lines beginning with MP or LA
  2. Change /usr/bin/gcc to mpicc
  3. Change /usr/bin/g77 to mpif77
  4. Change CCFLAGS to -mkl -O3 -fno-alias -DHPCC_FFT_235
  5. Change LINKFLAGS to -mkl -nofor-main
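
The five edits above can also be scripted. The following is a minimal sketch, not the procedure used for the original build: patch_hpl_makefile is a hypothetical helper name, and keeping the template's $(HPL_DEFS) prefix in CCFLAGS is an assumption, made so that the HPL include paths and defines still reach the compiler.

```shell
#!/bin/sh
# Hypothetical helper (not part of HPCC): apply the five edits while copying
# the template makefile. Usage: patch_hpl_makefile SRC DST
# Edits, in order: comment out MP* and LA* lines, switch the compiler and
# linker to the MPI wrappers, then replace CCFLAGS and LINKFLAGS.
# Assumption: the $(HPL_DEFS) prefix of the template's CCFLAGS is kept.
patch_hpl_makefile() {
    sed \
        -e 's/^MP/#MP/' \
        -e 's/^LA/#LA/' \
        -e 's|/usr/bin/gcc|mpicc|' \
        -e 's|/usr/bin/g77|mpif77|' \
        -e 's|^CCFLAGS .*|CCFLAGS      = $(HPL_DEFS) -mkl -O3 -fno-alias -DHPCC_FFT_235|' \
        -e 's|^LINKFLAGS .*|LINKFLAGS    = -mkl -nofor-main|' \
        "$1" > "$2"
}

# Run from the top of the hpcc-1.4.3 directory; skipped if the template is absent.
if [ -f hpl/setup/Make.Linux_ATHLON_FBLAS ]; then
    patch_hpl_makefile hpl/setup/Make.Linux_ATHLON_FBLAS hpl/Make.intel-mkl
fi
```

Comparing the result against the template with diff is a quick way to confirm that only the intended lines changed.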

The VALET commands are:

vpkg_devrequire intel
vpkg_devrequire openmpi/1.8.2-intel64

Exported variables (these supply values for the commented-out LAinc and LAlib):

export LAinc="$CPPFLAGS"
export LAlib="$LDFLAGS -nofor-main"

Make command with 4 parallel jobs:

make -j 4 arch=intel-mkl
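
Because the makefile now depends on the shell environment, a pre-build check can catch a missing VALET package or a forgotten export before make fails partway through. This is only a sketch: check_hpcc_build_env is a hypothetical helper, not part of HPCC or VALET, and it assumes printenv is available.

```shell
#!/bin/sh
# Hypothetical pre-build check: report anything the make step depends on
# that is missing from the current shell. Returns 0 only if all is present.
check_hpcc_build_env() {
    missing=0
    # The MPI compiler wrappers come from the openmpi VALET package.
    for tool in mpicc mpif77; do
        command -v "$tool" >/dev/null 2>&1 || { echo "missing tool: $tool"; missing=1; }
    done
    # make only sees exported variables; printenv fails for names that are
    # not in the environment. An exported but empty value still passes.
    for var in LAinc LAlib; do
        printenv "$var" >/dev/null || { echo "unset variable: $var"; missing=1; }
    done
    return "$missing"
}
```

A typical use would be `check_hpcc_build_env && make -j 4 arch=intel-mkl`, run from the hpcc-1.4.3 directory.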

package `intel/2015.0.090`
package `openmpi/1.8.2-intel64`

N = 30000, NB = 200, P = 5, Q = 8

These runs need 40 processes, arranged as a 5 x 8 process grid (P = 5 rows, Q = 8 columns). The job was therefore run with 40 slots, one per process.
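
The process count follows directly from the grid dimensions, and the same parameters give the size of the HPL matrix. A small sketch of the arithmetic, assuming 8-byte double-precision matrix elements:

```shell
#!/bin/sh
# HPL parameters from the run above.
N=30000; P=5; Q=8

nprocs=$((P * Q))              # MPI processes required by the P x Q grid
matrix_bytes=$((N * N * 8))    # N x N matrix, 8 bytes per element (assumed doubles)

echo "processes:    $nprocs"
echo "matrix bytes: $matrix_bytes"
```

This covers the coefficient matrix only; the other HPCC tests and MPI buffers add to the total memory footprint.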

WEB NAME                VALUE      UNITS
G-HPL                   0.6201     TeraFlops/Sec
G-PTRANS                0.0127     TeraBytes/Sec
G-RandomAccess          0.0789     GigaUpdates/Sec
G-FFT                   0.0222     TeraFlops/Sec
EP-STREAM Sys           0.1638     TeraBytes/Sec
EP-STREAM Triad         4.0951     GigaBytes/Sec
EP-DGEMM                14.7898    GigaFlops/Sec
RandomRing Bandwidth    0.5619     GigaBytes/Sec
RandomRing Latency      2.1133     micro-seconds
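
As a consistency check on the table, the system-wide EP-STREAM Sys figure appears to be the per-process Triad rate multiplied by the number of processes. The sketch below reproduces it, assuming 40 processes and 1000 GigaBytes per TeraByte:

```shell
#!/bin/sh
# Values taken from the table above.
triad_gbs=4.0951    # EP-STREAM Triad, GigaBytes/Sec per process
nprocs=40           # P x Q = 5 x 8

# Scale to the whole job and convert GB/s to TB/s (assumed factor of 1000).
sys_tbs=$(awk -v t="$triad_gbs" -v n="$nprocs" 'BEGIN { printf "%.4f", t * n / 1000 }')

echo "EP-STREAM Sys estimate: $sys_tbs TeraBytes/Sec"
```

The estimate agrees with the reported 0.1638 TeraBytes/Sec.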

qacct values:

ru_wallclock 130.716      
ru_utime     5114.084     
ru_stime     39.425       
maxvmem      25.174G
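
These values come from Grid Engine accounting (for example `qacct -j` followed by the job ID). One way to read them is as CPU efficiency: total CPU time divided by wall-clock time times the number of slots. A sketch of that calculation, assuming the 40 slots of the run above:

```shell
#!/bin/sh
# qacct values from the run above; slots is the assumed job size.
wallclock=130.716
utime=5114.084
stime=39.425
slots=40

# Percentage of the available slot-seconds spent on the CPU.
eff=$(awk -v w="$wallclock" -v u="$utime" -v s="$stime" -v n="$slots" \
    'BEGIN { printf "%.1f", 100 * (u + s) / (w * n) }')

echo "CPU efficiency: $eff%"
```

A value close to 100% suggests the 40 processes were busy for nearly the whole run.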

package `acml/5.3.0-open64-fma4`
package `open64/4.5`
package `openmpi/1.4.4-open64` or `openmpi/1.6.1-open64`

N = 72000, NB = 100, P = 12, Q = 16
nproc = 2x192 (384 slots, with 192 MPI workers each bound to a Bulldozer core pair)

The two runs differ mainly in whether QLogic PSM endpoints are used.

Result                             ^PSM (v1.4.4)    PSM (v1.6.1)
HPL_Tflops                         1.68496          2.08056
StarDGEMM_Gflops                   14.6933          14.8339
SingleDGEMM_Gflops                 15.642           15.536
PTRANS_GBs                         9.25899          18.4793
StarFFT_Gflops                     1.19982          1.25452
StarSTREAM_Triad                   3.62601          3.65631
SingleFFT_Gflops                   1.44111          1.44416
MPIFFT_Gflops                      7.67835          77.603
RandomlyOrderedRingLatency_usec    65.8478          2.44898
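
The communication-bound tests change the most between the two columns. The sketch below computes the ratios for four rows of the table; ratio is a hypothetical helper, and the latency ratio is taken as old over new so that a larger number means an improvement in every case.

```shell
#!/bin/sh
# Hypothetical helper: print a / b to two decimal places.
ratio() { awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f", a / b }'; }

# Throughput rows: v1.6.1 value over v1.4.4 value.
hpl=$(ratio 2.08056 1.68496)
ptrans=$(ratio 18.4793 9.25899)
mpifft=$(ratio 77.603 7.67835)
# Latency row: v1.4.4 value over v1.6.1 value (lower latency is better).
latency=$(ratio 65.8478 2.44898)

echo "HPL speedup:       ${hpl}x"
echo "PTRANS speedup:    ${ptrans}x"
echo "MPIFFT speedup:    ${mpifft}x"
echo "Latency reduction: ${latency}x"
```

The compute-bound rows (StarDGEMM, SingleDGEMM, StarSTREAM, SingleFFT) are nearly identical in both columns, which is consistent with the difference lying in the interconnect path.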
  • software/hpcc/farber.txt
  • Last modified: 2018-05-08 13:28
  • by sraskar