technical:whitepaper:r-runtime-blas-lapack

Differences

This shows you the differences between two versions of the page.

technical:whitepaper:r-runtime-blas-lapack [2018-12-10 11:00] – created frey
technical:whitepaper:r-runtime-blas-lapack [2018-12-10 11:55] frey
Line 1: Line 1:
 ====== R: Runtime-configuration BLAS/LAPACK ======
  
 +The R Project for Statistical Computing is used on our clusters across a wide variety of scientific disciplines.  Though the applications vary widely, many of them rely on BLAS/LAPACK functionality.  R provides its own baseline implementations that will build on any system; naturally, these bundled BLAS/LAPACK libraries cannot be expected to perform as well as tuned implementations like:
 +  * Intel Math Kernel Library (MKL)
 +  * Automatically-Tuned Linear Algebra Software (ATLAS)
 +The build procedure for R allows the package to be configured for building against external BLAS/LAPACK libraries.  Once the base R build has completed and the resulting software has been installed, additional R libraries can be configured and installed atop it.  It has been noted in the past that:
 +  - Producing //N// such builds of R that vary only in the choice of underlying BLAS/LAPACK:
 +    * can require on the order of //N// times the disk space of a single build
 +    * puts a greater burden on the sysadmin to maintain all //N// similarly-outfitted copies
 +  - R only makes use of standardized BLAS/LAPACK APIs, so it should be possible to choose any standard BLAS/LAPACK library at runtime (not just at build time).
 +
 +===== Substituting an alternate library =====
 +
 +Others have published articles detailing how to substitute the ATLAS library into a basic R build (one that was built with its bundled BLAS/LAPACK):
 +  * [[https://www.r-bloggers.com/r-r-with-atlas-r-with-openblas-and-revolution-r-open-which-is-fastest/]]
 +  * [[https://stackoverflow.com/questions/29984141/does-installing-blas-atlas-mkl-openblas-will-speed-up-r-package-that-is-written]]
 +The basic idea is:
 +  * copy ''libatlas.so'' to ''**R_PREFIX**/lib64/R/lib''
 +  * remove ''libRblas.so'' and ''libRlapack.so'' from ''**R_PREFIX**/lib64/R/lib''
 +  * symlink ''libRblas.so'' and ''libRlapack.so'' to ''libatlas.so'' in ''**R_PREFIX**/lib64/R/lib''
 +This copy of R is configured to use ''**R_PREFIX**/lib64/R/lib'' to resolve shared libraries, so when executing the ''R'' command, for example, the symlinks will lead the runtime linker to the ATLAS library when resolving BLAS/LAPACK functions.
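
The three steps listed above can be sketched as a small shell function.  This is only an illustration:  the function name ''substitute_atlas'' and its two arguments are hypothetical, and the location of ''libatlas.so'' on a given system has to be supplied by the user.

<code bash>
#!/usr/bin/env bash
# Sketch of the copy / remove / symlink procedure.
# substitute_atlas is a hypothetical helper, not part of R or ATLAS.
#   $1 = R's private library directory, e.g. ${R_PREFIX}/lib64/R/lib
#   $2 = full path to the ATLAS shared library on this system
substitute_atlas() {
    local r_lib="$1"
    local atlas_so="$2"

    # 1. copy the ATLAS library into R's library directory
    cp "$atlas_so" "$r_lib/libatlas.so" || return 1

    # 2. remove the bundled BLAS/LAPACK libraries
    rm -f "$r_lib/libRblas.so" "$r_lib/libRlapack.so"

    # 3. point both names at the ATLAS library (relative symlinks)
    ln -s libatlas.so "$r_lib/libRblas.so" &&
    ln -s libatlas.so "$r_lib/libRlapack.so"
}
</code>

Because step 2 deletes the bundled libraries, it would be prudent to back them up first if there is any chance they will be wanted again.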
 +
 +This scheme requires two things:
 +  - the user must have ownership of the R installation or sufficient privileges to alter the files
 +  - the BLAS/LAPACK substitution will happen one time only (probably shortly after the library is built)
 +While the first condition is obvious, the second may not seem important, especially for a build of R being maintained by an arbitrary user in an arbitrary location on the filesystem.  However, computational reproducibility demands that any alteration to the underlying BLAS/LAPACK be present -- or at least restorable -- at any time.  This is one reason why ''libatlas.so'' was copied into the build and symlinks were used:  with other BLAS/LAPACK libraries also present, the ''libRblas.so'' and ''libRlapack.so'' symlinks can be altered as necessary.  The caveats, however, are that:
 +  * only a single choice of underlying BLAS/LAPACK can be active
 +  * the underlying BLAS/LAPACK can be changed only when that build of R is not being executed/used
 +
 +A simple way to organize multiple underlying BLAS/LAPACK libraries in a single R installation is to create subdirectories for each variant:
 +
 +^Path  ^Description^
 +|''**R_PREFIX**/lib64/R/lib''  |base directory where R looks for shared libraries by default|
 +|''**R_PREFIX**/lib64/R/lib/libRblas.so''  |symlink to chosen BLAS library (from one of the subdirectories herein)|
 +|''**R_PREFIX**/lib64/R/lib/libRlapack.so''  |symlink to chosen LAPACK library (from one of the subdirectories herein)|
 +|''**R_PREFIX**/lib64/R/lib/atlas''  |directory to hold ''libatlas.so''|
 +|''**R_PREFIX**/lib64/R/lib/rblas''  |directory to hold the bundled ''libRblas.so'' and ''libRlapack.so'' produced by R build procedure|
 +|''**R_PREFIX**/lib64/R/lib/mkl''  |directory to hold MKL variants|
 +|''**R_PREFIX**/lib64/R/lib/mkl/seq''  |directory to hold sequential MKL variant|
 +|''**R_PREFIX**/lib64/R/lib/mkl/thr''  |directory to hold threaded MKL variant|
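
A minimal sketch of how the layout in the table above might be created and how the two symlinks could be switched between variants.  The function names ''make_variant_dirs'' and ''select_blas'' are hypothetical, and the sketch assumes the bundled libraries have already been moved into ''rblas'' before any switch is attempted.

<code bash>
#!/usr/bin/env bash
# Hypothetical helpers; $1 is always R's library directory,
# e.g. ${R_PREFIX}/lib64/R/lib

# Create one subdirectory per BLAS/LAPACK variant.
make_variant_dirs() {
    local r_lib="$1"
    mkdir -p "$r_lib/atlas" "$r_lib/rblas" "$r_lib/mkl/seq" "$r_lib/mkl/thr"
}

# Repoint libRblas.so and libRlapack.so at one variant.
#   $2 = atlas | rblas | mkl/seq | mkl/thr
select_blas() {
    local r_lib="$1" variant="$2" blas lapack name
    case "$variant" in
        atlas)
            # ATLAS provides both APIs in a single library
            blas="atlas/libatlas.so"
            lapack="atlas/libatlas.so" ;;
        rblas|mkl/seq|mkl/thr)
            blas="$variant/libRblas.so"
            lapack="$variant/libRlapack.so" ;;
        *)
            echo "unknown variant: $variant" >&2
            return 1 ;;
    esac

    # refuse to switch to a variant that has not been populated
    [ -e "$r_lib/$blas" ] && [ -e "$r_lib/$lapack" ] || {
        echo "variant $variant is incomplete" >&2
        return 1
    }

    # never overwrite a real (non-symlink) library file
    for name in libRblas.so libRlapack.so; do
        if [ -e "$r_lib/$name" ] && [ ! -L "$r_lib/$name" ]; then
            echo "$name is a regular file; move it into rblas/ first" >&2
            return 1
        fi
    done

    ln -sfn "$blas" "$r_lib/libRblas.so" &&
    ln -sfn "$lapack" "$r_lib/libRlapack.so"
}
</code>

For example, ''select_blas "${R_PREFIX}/lib64/R/lib" mkl/seq'' would activate the sequential MKL variant.  As noted earlier, this should only be done while that build of R is not in use.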
 +
 +The ATLAS library contains both BLAS and LAPACK APIs in a single shared library, so both the ''libRblas.so'' and ''libRlapack.so'' symlinks are pointed to it.  The Intel MKL contains both APIs as well, but is split into separate libraries according to the threading model of the runtime environment:  sequential (non-threaded) or OpenMP (multithreaded).  Our solution is to build a shim library linked to the appropriate Intel libraries.
 +
 +==== Sequential MKL shim ====
 +
 +A C source file containing a dummy function was created in ''**R_PREFIX**/lib64/R/lib/mkl/seq'':
 +
 +<file C shim.c>
 +int
 +mkl_shim_dummy(void)
 +{
 + return 0;
 +}
 +</file>
 +
 +The shim library is then created thusly:
 +
 +<code bash>
 +$ cd ${R_PREFIX}/lib64/R/lib/mkl/seq
 +$ icc -shared -o libRblas.so -mkl=sequential shim.c
 +$ ln -s libRblas.so libRlapack.so
 +</code>
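
Since the shim itself references no MKL symbols, it may be worth confirming that the MKL libraries were actually recorded as dependencies of the resulting ''libRblas.so''.  The following is a sketch only:  ''filter_mkl_deps'' and ''check_shim'' are hypothetical helper names, and the check assumes the standard ''ldd'' output format on Linux.

<code bash>
#!/usr/bin/env bash
# Hypothetical helpers for inspecting a shim library.

# Read `ldd` output on stdin; print the names of any MKL libraries.
filter_mkl_deps() {
    awk '$1 ~ /^libmkl_/ { print $1 }'
}

# Succeed (and list the MKL libraries) only if the shared object
# named by $1 depends on at least one MKL library.
check_shim() {
    local deps
    deps=$(ldd "$1" | filter_mkl_deps)
    [ -n "$deps" ] && printf '%s\n' "$deps"
}
</code>

Running ''check_shim libRblas.so'' in the ''mkl/seq'' directory should then list one or more ''libmkl_*'' libraries; an empty result with a non-zero exit status would suggest the linker dropped them.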
 +
 +==== Threaded MKL shim ====
 +
 +A C source file containing a dummy function was created in ''**R_PREFIX**/lib64/R/lib/mkl/thr'':
 +
 +<file C shim.c>
 +int
 +mkl_shim_dummy(void)
 +{
 + return 0;
 +}
 +</file>
 +
 +Since our R build used the GNU C compiler, the threaded MKL variant only works if the shim library is built against the GNU OpenMP runtime.  Using just ''-mkl=parallel'' links against the Intel OpenMP runtime, which in testing yielded numerical issues (not actual crashes).  The shim library is then created thusly:
 +
 +<code bash>
 +$ cd ${R_PREFIX}/lib64/R/lib/mkl/thr
 +$ icc -shared -o libRblas.so shim.c -lmkl_gnu_thread -lmkl_core -lmkl_intel_lp64
 +$ ln -s libRblas.so libRlapack.so
 +</code>
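
Given the numerical issues seen when the Intel OpenMP runtime was mixed in, one way to check which OpenMP runtime a library or executable resolves is to classify its ''ldd'' output.  This is a sketch under assumptions:  ''openmp_runtime'' is a hypothetical helper, and it assumes the usual library names ''libgomp.so'' (GNU) and ''libiomp5.so'' (Intel).

<code bash>
#!/usr/bin/env bash
# Hypothetical helper: read `ldd` output on stdin and report which
# OpenMP runtime(s) appear among the dependencies.
# Prints one of: gnu, intel, mixed, none
openmp_runtime() {
    awk '
        $1 ~ /^libgomp\.so/  { gnu = 1 }
        $1 ~ /^libiomp5\.so/ { intel = 1 }
        END {
            if (gnu && intel) print "mixed"
            else if (gnu)     print "gnu"
            else if (intel)   print "intel"
            else              print "none"
        }'
}
</code>

For example, ''ldd libRblas.so | openmp_runtime'' inspects the shim.  A result of ''none'' is plausible here, because the link line above names no OpenMP runtime explicitly; in that case the runtime presumably comes from the R process that loads the shim, so R's own shared libraries can be inspected the same way.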
 +
 +===== Runtime-configurable substitution =====
  • technical/whitepaper/r-runtime-blas-lapack.txt
  • Last modified: 2018-12-10 12:40
  • by frey