Differences
This shows you the differences between two versions of the page.
technical:whitepaper:start [2021-02-12 16:01] – frey | technical:whitepaper:start [2022-03-28 15:40] – frey | ||
---|---|---|---|
Line 1: | Line 1: | ||
- | ====== White Papers ====== | ||
- | |||
- | {{ : | ||
- | |||
- | ===== Mellanox UCX and Open MPI on DARWIN ===== | ||
- | |||
- | During early-access testing of the DARWIN cluster, several users reported that their MPI jobs crashed unexpectedly in code locations that worked on previous clusters (like Caviness). | ||
- | |||
- | ===== /dev/shm exhaustion ===== | ||
- | |||
- | As time goes by, the ''/ | ||
- | |||
- | ===== R: runtime configurable BLAS/LAPACK ===== | ||
- | |||
- | The R statistical computing software can be built atop a variety of BLAS and LAPACK libraries -- including its own internal //Rblas// and //Rlapack// libraries. | ||
- | |||
- | ===== Mills: threading performance study ===== | ||
- | |||
- | The behavior of the Mills cluster' | ||
- | |||
- | {{: | ||
- | |||
- | ===== Mills: AMD Opteron 6200 Unix Tuning Guide ===== | ||
- | |||
- | The nodes on the Mills cluster have 2 or 4 AMD Opteron 6200 series sockets. | ||
- | |||
- | This technical tuning guide is intended for " | ||
- | and developers on a Linux platform who perform application development, | ||
- | system installation" | ||
- | |||
- | |||
- | [[http:// | ||
- | |||
- | ===== HPC Challenge Awards Competition at SC Conference ===== | ||
- | |||
- | The SC((The International Conference for High Performance Computing, Networking, Storage and Analysis)) High Performance Computing Challenge includes the benchmarks: | ||
- | |||
- | - HPL measures the floating-point rate of execution for solving a linear system of equations | ||
- | - DGEMM measures the floating-point rate of execution of double-precision real matrix-matrix multiplication | ||
- | - STREAM measures sustainable memory bandwidth (in GB/s) and the corresponding computation rate for a simple vector kernel | ||
- | - PTRANS (parallel matrix transpose) exercises communications between pairs of processors. It is a useful test of the total communications capacity of the network. | ||
- | - Random Access measures the rate of integer random updates of memory (GUPS) | ||
- | - FFT measures the floating-point rate of execution of double-precision complex one-dimensional Discrete Fourier Transform (DFT) | ||
- | - Communication bandwidth and latency measures latency and bandwidth of a number of simultaneous communication patterns; based on b_eff (effective bandwidth benchmark). | ||
- | |||
- | [[http:// | ||
- | |||
- | ===== Matlab: Computational threads on a shared cluster ===== | ||
- | |||
- | By default, Matlab uses multiple computational threads for standard linear algebra calculations. | ||
- | |||
- | To fully use the computational threads, you must call the built-in high-level functions or data-parallel constructs in Matlab. | ||
- | |||
- | |||
- | |||
- | ===== Mills: Using ACML In High Performance Computing Challenge ===== | ||
- | |||
- | For Mills, the recommended libraries include OpenMPI, ACML, and FFTW. The AMD-recommended compilers include Open64 and PGI. | ||
- | The following document from AMD includes instructions for installing these libraries, but installation is not needed on Mills because they are already installed as VALET packages. | ||
- | |||
- | [[http:// | ||
- | |||
- | ===== Mills: Benchmarking studies ===== | ||
- | |||
- | ==== High Performance Computing Challenge studies ==== | ||
- | |||
- | * [[hpcc open64 acml|Open64 compiler with ACML and OpenMPI libraries]] | ||