R: Runtime-configuration BLAS/LAPACK
The R Project for Statistical Computing is used on our clusters by a wide variety of scientific disciplines. Though the breadth of applications is wide, many of them require the functionality of BLAS/LAPACK libraries. R provides its own baseline implementations that will build on any system; naturally, one cannot expect these BLAS/LAPACK libraries to be highly performant relative to implementations like:
- Intel Math Kernel Library (MKL)
- Automatically-Tuned Linear Algebra Software (ATLAS)
The build procedure for R allows the package to be configured for building against external BLAS/LAPACK libraries. Once the base R build has completed and the resulting software has been installed, additional R libraries can be configured and installed atop it. It has been noted in the past that:
- Producing N such builds of R that vary only in the choice of underlying BLAS/LAPACK:
  - can require on the order of N times the disk space of a single build
  - puts a greater burden on the sysadmin to maintain all N similarly-outfitted copies
- R only makes use of standardized BLAS/LAPACK APIs, so any standard BLAS/LAPACK library should be able to be chosen at runtime (not just build time).
Substituting an alternate library
Others have published articles detailing how to substitute the ATLAS library into a basic R build (one built with its bundled BLAS/LAPACK). The basic idea is:
- copy libatlas.so to R_HOME/lib64/R/lib
- remove libRblas.so and libRlapack.so from R_HOME/lib64/R/lib
- symlink libRblas.so and libRlapack.so to libatlas.so in R_HOME/lib64/R/lib
This copy of R is configured to use R_HOME/lib64/R/lib to resolve shared libraries, so when executing the R command, for example, the symlinks will lead the runtime linker to the ATLAS library when resolving BLAS/LAPACK functions.
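The three steps above can be sketched as shell commands. To keep the sketch safe to run, it works in a scratch directory that stands in for R_HOME/lib64/R/lib and uses empty placeholder files instead of real libraries; the names `scratch`, `libdir` and `atlas_src` are hypothetical.

```shell
#!/bin/sh
# Sketch only: a scratch directory stands in for R_HOME/lib64/R/lib and
# empty files stand in for the real shared libraries.
set -eu
scratch=$(mktemp -d)
libdir="$scratch/lib64/R/lib"
atlas_src="$scratch/atlas-build"   # hypothetical location of a built ATLAS
mkdir -p "$libdir" "$atlas_src"
touch "$libdir/libRblas.so" "$libdir/libRlapack.so" "$atlas_src/libatlas.so"

# 1. copy libatlas.so into the R library directory
cp "$atlas_src/libatlas.so" "$libdir/"
# 2. remove the bundled BLAS/LAPACK libraries
rm -f "$libdir/libRblas.so" "$libdir/libRlapack.so"
# 3. symlink both names to the ATLAS library (relative links)
ln -s libatlas.so "$libdir/libRblas.so"
ln -s libatlas.so "$libdir/libRlapack.so"

ls -l "$libdir"
```

On a real installation the same three commands would be run against the actual library directory, which is why write access to the installation is needed.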
This scheme requires two things:
- the user must have ownership of the R installation or sufficient privileges to alter the files
- the BLAS/LAPACK substitution will happen one time only (probably shortly after the library is built)
While the first condition is obvious, the second may not seem important, especially for a build of R being maintained by an arbitrary user in an arbitrary location on the filesystem. However, computational reproducibility would demand that any alteration to the underlying BLAS/LAPACK be present, or at least able to be restored, at any time. This is one reason why libatlas.so was copied into the build and symlinks were used: having other BLAS/LAPACK libraries present, the libRblas.so and libRlapack.so symlinks can be altered as necessary. The caveat, however, is that:
- only a single choice of underlying BLAS/LAPACK can be active
- the underlying BLAS/LAPACK can be changed only when that build of R is not being executed/used
A simple way to organize multiple underlying BLAS/LAPACK libraries in a single R installation is to create subdirectories for each variant:
Path | Description |
---|---|
R_HOME/lib64/R/lib | base directory where R looks for shared libraries by default |
R_HOME/lib64/R/lib/libRblas.so | symlink to chosen BLAS library (from one of the subdirectories herein) |
R_HOME/lib64/R/lib/libRlapack.so | symlink to chosen LAPACK library (from one of the subdirectories herein) |
R_HOME/lib64/R/lib/atlas | directory to hold libatlas.so |
R_HOME/lib64/R/lib/rblas | directory to hold the bundled libRblas.so and libRlapack.so produced by R build procedure |
R_HOME/lib64/R/lib/mkl | directory to hold MKL variants |
R_HOME/lib64/R/lib/mkl/seq | directory to hold sequential MKL variant |
R_HOME/lib64/R/lib/mkl/thr | directory to hold threaded MKL variant |
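As a rough sketch, the subdirectories in the table could be created as follows. `R_LIBDIR` is a hypothetical variable standing in for R_HOME/lib64/R/lib; if it is not set, the sketch uses a scratch directory so that no real installation is touched.

```shell
#!/bin/sh
# Sketch: create the per-variant subdirectories listed in the table.
# R_LIBDIR is a hypothetical variable; it defaults to a scratch directory.
set -eu
R_LIBDIR=${R_LIBDIR:-$(mktemp -d)/lib64/R/lib}

# mkdir -p also creates the intermediate mkl directory.
for variant in atlas rblas mkl/seq mkl/thr; do
    mkdir -p "$R_LIBDIR/$variant"
done

# Show the resulting directory tree.
find "$R_LIBDIR" -type d | sort
```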
R BLAS/LAPACK
When we restructured the R lib64/R/lib directory, the bundled libRblas.so and libRlapack.so shared library files were moved to the rblas subdirectory. To configure R to use its bundled libraries:

    $ cd ${R_HOME}/lib64/R/lib
    $ rm -f libR{blas,lapack}.so
    $ ln -s rblas/libRblas.so .
    $ ln -s rblas/libRlapack.so .
ATLAS
The ATLAS library contains both BLAS and LAPACK APIs in a single shared library. With libatlas.so copied into the R_HOME/lib64/R/lib/atlas subdirectory, we configure R to use ATLAS:

    $ cd ${R_HOME}/lib64/R/lib
    $ rm -f libR{blas,lapack}.so
    $ ln -s atlas/libRblas.so .
    $ ln -s atlas/libRlapack.so .
Sequential MKL
A C source file, shim.c, containing a dummy function was created in R_HOME/lib64/R/lib/mkl/seq:

    int mkl_shim_dummy(void) { return 0; }

Linking this otherwise empty shim against MKL records the MKL libraries as its dependencies, so loading the shim under the libRblas.so name pulls in MKL's BLAS/LAPACK functions. The shim library is then created thusly:

    $ cd ${R_HOME}/lib64/R/lib/mkl/seq
    $ icc -shared -o libRblas.so -mkl=sequential shim.c
    $ ln -s libRblas.so libRlapack.so

To configure R to use the sequential MKL:

    $ cd ${R_HOME}/lib64/R/lib
    $ rm -f libR{blas,lapack}.so
    $ ln -s mkl/seq/libRblas.so .
    $ ln -s mkl/seq/libRlapack.so .
Threaded MKL
A C source file, shim.c, containing a dummy function was created in R_HOME/lib64/R/lib/mkl/thr:

    int mkl_shim_dummy(void) { return 0; }

Since our R build used the GNU C compiler, the threaded MKL variant only works if the shim library is built against the GNU OpenMP runtime. Using just "-mkl=parallel" links against the Intel OpenMP runtime, which in testing yielded numerical issues (not actual crashes). The shim library is then created thusly:

    $ cd ${R_HOME}/lib64/R/lib/mkl/thr
    $ icc -shared -o libRblas.so shim.c -lmkl_gnu_thread -lmkl_core -lmkl_intel_lp64
    $ ln -s libRblas.so libRlapack.so

To configure R to use the threaded MKL:

    $ cd ${R_HOME}/lib64/R/lib
    $ rm -f libR{blas,lapack}.so
    $ ln -s mkl/thr/libRblas.so .
    $ ln -s mkl/thr/libRlapack.so .
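The rm/ln sequence is the same for all four variants, so it could be wrapped in a small helper. The following is a minimal sketch, not something R provides: `select_blas` is a hypothetical name, and it assumes each variant subdirectory already contains both libRblas.so and libRlapack.so.

```shell
#!/bin/sh
# Sketch: repoint the libRblas.so/libRlapack.so symlinks at one variant.
# select_blas is a hypothetical helper, not part of R.
#   $1  R library directory (R_HOME/lib64/R/lib in the text)
#   $2  variant subdirectory: rblas, atlas, mkl/seq or mkl/thr
select_blas() {
    libdir=$1
    variant=$2
    # Refuse to switch unless the variant provides both libraries,
    # so a typo cannot leave R without any BLAS/LAPACK.
    for lib in libRblas.so libRlapack.so; do
        if [ ! -e "$libdir/$variant/$lib" ]; then
            echo "select_blas: $variant/$lib not found in $libdir" >&2
            return 1
        fi
    done
    # Replace the old symlinks with relative links into the variant.
    for lib in libRblas.so libRlapack.so; do
        rm -f "$libdir/$lib"
        ln -s "$variant/$lib" "$libdir/$lib"
    done
}
```

With this helper, switching to the threaded MKL would be `select_blas "${R_HOME}/lib64/R/lib" mkl/thr`. The helper changes nothing about the caveats above: only one variant is active at a time, and it should not be switched while that build of R is in use.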
Runtime-configurable substitution
By stashing each BLAS/LAPACK variant in its own subdirectory, our copy of R is actually fairly close to being runtime-configurable with respect to choice of BLAS/LAPACK. Since all R commands set up the environment so that the runtime linker checks R_HOME/lib64/R/lib for shared libraries, the libRblas.so and libRlapack.so symlinks in that directory will always have priority over any other path we might add to LD_LIBRARY_PATH prior to issuing the R command, for example. However, if libRblas.so and libRlapack.so are not present in that directory, the runtime linker will be forced to consult other paths present in LD_LIBRARY_PATH.
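The lookup behaviour described here can be illustrated with a simplified model: walk a colon-separated search path in order and report the first directory that provides the library. `find_lib` is a hypothetical helper written for this illustration; the real runtime linker also consults RPATH/RUNPATH entries and its cache, which the sketch ignores.

```shell
#!/bin/sh
# Simplified model of the lookup described above: print the first
# directory in a colon-separated search path that provides a library.
# find_lib is a hypothetical helper; RPATH/RUNPATH and the linker cache
# are deliberately ignored.
#   $1  library file name, e.g. libRblas.so
#   $2  colon-separated list of directories, searched left to right
find_lib() {
    lib=$1
    old_ifs=$IFS
    IFS=:
    for dir in $2; do
        # Empty entries are skipped in this model.
        if [ -n "$dir" ] && [ -e "$dir/$lib" ]; then
            IFS=$old_ifs
            printf '%s\n' "$dir"
            return 0
        fi
    done
    IFS=$old_ifs
    return 1
}

# Demo in a scratch tree: the base directory has no libRblas.so, so the
# search falls through to the atlas subdirectory.
base=$(mktemp -d)
mkdir -p "$base/lib/atlas"
touch "$base/lib/atlas/libRblas.so"
find_lib libRblas.so "$base/lib:$base/lib/atlas"
```

If a libRblas.so file or symlink were placed in the base directory, the same call would report the base directory instead, which is the priority effect described above.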
On our Caviness cluster we include no BLAS/LAPACK library symlinks in the base directory which R checks for shared libraries:
    $ cd ${R_HOME}/lib64/R/lib
    $ rm -f libR{blas,lapack}.so
In our VALET package definition for R we configure four variants that differ by BLAS/LAPACK (as a feature tag):

    $ vpkg_versions r
    :
    r                The R Project for Statistical Computing
      3.5            alias to r/3.5.1
    * 3.5.1          R 3.5.1 with system compilers, ATLAS
      3.5.1:mkl-seq  R 3.5.1 with system compilers, MKL (sequential)
      3.5.1:mkl-thr  R 3.5.1 with system compilers, MKL (multithread)
      3.5.1:rblas    R 3.5.1 with system compilers, R reference BLAS/LAPACK

with ATLAS as the default (recommended) choice of underlying BLAS/LAPACK. Each variant of the 3.5.1 version uses the same installation prefix, but adds a unique BLAS/LAPACK subdirectory to LD_LIBRARY_PATH when added to the user's environment:

    $ vpkg_require r/3.5.1
    Adding package `r/3.5.1` to your environment
    $ echo $LD_LIBRARY_PATH
    /opt/shared/r/3.5.1/lib64/R/lib/atlas: ...
    $ vpkg_rollback all
    $ vpkg_require r/3.5.1:rblas
    Adding package `r/3.5.1:rblas` to your environment
    $ echo $LD_LIBRARY_PATH
    /opt/shared/r/3.5.1/lib64/R/lib/rblas:
This works fine so long as you don't attempt to install any R modules that require the BLAS/LAPACK functionality. The R module-building environment defaults to using link search paths in the base directory (R_HOME/lib64/R/lib) and fails to find libRblas.so or libRlapack.so. This is easily fixed by altering two lines in the R installation's standard make flags file at R_HOME/lib64/R/etc/Makeconf:
BLAS_LIBS = -L"$(R_HOME)/lib$(R_ARCH)/$(R_BLAS_VARIANT)" -lRblas
and
LAPACK_LIBS = -L"$(R_HOME)/lib$(R_ARCH)/$(R_BLAS_VARIANT)" -lRlapack
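One way to apply these two edits is with sed. The sketch below works on a scratch copy and assumes the stock lines have the form written into the sample file (the same lines without the `/$(R_BLAS_VARIANT)` component); that form is an assumption, so check the real Makeconf before adapting it.

```shell
#!/bin/sh
# Sketch: add the R_BLAS_VARIANT component to the two link lines.
# ASSUMPTION: the stock Makeconf lines look like the two written below.
set -eu
work=$(mktemp -d)
cat > "$work/Makeconf.orig" <<'EOF'
BLAS_LIBS = -L"$(R_HOME)/lib$(R_ARCH)" -lRblas
LAPACK_LIBS = -L"$(R_HOME)/lib$(R_ARCH)" -lRlapack
EOF

# Single quotes keep the shell from expanding the $(...) make variables.
sed -e '/^BLAS_LIBS = /s|\$(R_ARCH)"|$(R_ARCH)/$(R_BLAS_VARIANT)"|' \
    -e '/^LAPACK_LIBS = /s|\$(R_ARCH)"|$(R_ARCH)/$(R_BLAS_VARIANT)"|' \
    "$work/Makeconf.orig" > "$work/Makeconf"

cat "$work/Makeconf"
```

Writing to a new file rather than editing in place keeps the original around for comparison with diff.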
The link search path added here defaults to the usual path if R_BLAS_VARIANT is not set in the user's environment, because make expands an unset variable to an empty string. But in our VALET configuration of each of the variants, we set the appropriate relative path for R_BLAS_VARIANT:

    $ vpkg_require r/3.5.1:rblas
    Adding package `r/3.5.1:rblas` to your environment
    $ echo $R_BLAS_VARIANT
    rblas
    $ vpkg_rollback all
    $ vpkg_require r/3.5.1:mkl-seq
    Adding package `r/3.5.1:mkl-seq` to your environment
    $ echo $R_BLAS_VARIANT
    mkl/seq
We now have a runtime-configurable BLAS/LAPACK for this single installation of R, and any properly-packaged R modules should build fine against it.
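On a system without VALET, the two BLAS-related environment changes could be approximated with a plain shell function. This is only a sketch: `use_r_blas` is a hypothetical name, it covers just LD_LIBRARY_PATH and R_BLAS_VARIANT, and unlike vpkg_rollback it does not undo a previously selected variant.

```shell
#!/bin/sh
# Sketch of the two BLAS-related environment changes, without VALET.
# use_r_blas is a hypothetical helper.
#   $1  installation prefix, e.g. /opt/shared/r/3.5.1
#   $2  variant subdirectory: atlas, rblas, mkl/seq or mkl/thr
use_r_blas() {
    R_BLAS_VARIANT=$2
    # Prepend the variant directory; ${VAR:+...} avoids a trailing colon
    # when LD_LIBRARY_PATH was previously unset or empty.
    LD_LIBRARY_PATH="$1/lib64/R/lib/$2${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
    export R_BLAS_VARIANT LD_LIBRARY_PATH
}
```

For example, `use_r_blas /opt/shared/r/3.5.1 mkl/seq` in a fresh shell, followed by the R command, would select the sequential MKL variant for that session.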
VALET configuration
Here is the VALET configuration for our installation of R 3.5.1 with runtime-configurable BLAS/LAPACK:
    r:
        description: The R Project for Statistical Computing
        url: http://www.r-project.org/
        prefix: /opt/shared/r
        default-version: "3.5.1"
        versions:
            "3.5":
                alias-to: 3.5.1
            "3.5.1":
                description: R 3.5.1 with system compilers, ATLAS
                actions:
                    - libdir: lib64/R/lib
                    - incdir: lib64/R/include
                    - libdir: lib64/R/lib/atlas
                    - variable: R_BLAS_VARIANT
                      operator: set
                      value: atlas
            "3.5.1:rblas":
                description: R 3.5.1 with system compilers, R reference BLAS/LAPACK
                prefix: 3.5.1
                actions:
                    - libdir: lib64/R/lib
                    - incdir: lib64/R/include
                    - libdir: lib64/R/lib/rblas
                    - variable: R_BLAS_VARIANT
                      operator: set
                      value: rblas
            "3.5.1:mkl-seq":
                description: R 3.5.1 with system compilers, MKL (sequential)
                prefix: 3.5.1
                actions:
                    - libdir: lib64/R/lib
                    - incdir: lib64/R/include
                    - libdir: lib64/R/lib/mkl/seq
                    - variable: R_BLAS_VARIANT
                      operator: set
                      value: mkl/seq
            "3.5.1:mkl-thr":
                description: R 3.5.1 with system compilers, MKL (multithread)
                prefix: 3.5.1
                actions:
                    - libdir: lib64/R/lib
                    - incdir: lib64/R/include
                    - libdir: lib64/R/lib/mkl/thr
                    - variable: R_BLAS_VARIANT
                      operator: set
                      value: mkl/thr