R: Runtime-configurable BLAS/LAPACK

The R Project for Statistical Computing is used on our clusters across a wide variety of scientific disciplines. Though the applications vary widely, many of them require the functionality of BLAS/LAPACK libraries. R provides its own baseline implementations that will build on any system; naturally, one cannot expect these bundled BLAS/LAPACK libraries to perform as well as optimized implementations like:

The build procedure for R allows the package to be configured for building against external BLAS/LAPACK libraries. Once the base R build has completed and the resulting software has been installed, additional R libraries can be configured and installed atop it. It has been noted in the past that:

  1. Producing N such builds of R that vary only in the choice of underlying BLAS/LAPACK:
    • can require on the order of N times the disk space of a single build
    • puts a greater burden on the sysadmin to maintain all N similarly-outfitted copies
  2. R only makes use of standardized BLAS/LAPACK APIs, so any standard BLAS/LAPACK library should be able to be chosen at runtime (not just build time).

Substituting an alternate library

Others have published articles in the past detailing the substitution of the ATLAS library by doing the following to a basic R build (which was built with its bundled BLAS/LAPACK):

The basic idea is:

This copy of R is configured to use R_HOME/lib64/R/lib to resolve shared libraries, so when executing the R command, for example, the symlinks will lead the runtime linker to the ATLAS library when resolving BLAS/LAPACK functions.

This scheme requires two things:

  1. the user must have ownership of the R installation or sufficient privileges to alter the files
  2. the BLAS/LAPACK substitution will happen one time only (probably shortly after the library is built)

While the first condition is obvious, the second may not seem important, especially for a build of R being maintained by an arbitrary user in an arbitrary location on the filesystem. However, computational reproducibility would demand that any alteration to the underlying BLAS/LAPACK be present – or at least able to be restored – at any time. This is one reason why libatlas.so was copied into the build and symlinks were used: having other BLAS/LAPACK libraries present, the libRblas.so and libRlapack.so symlinks can be altered as necessary. The caveat, however, is that:

A simple way to organize multiple underlying BLAS/LAPACK libraries in a single R installation is to create subdirectories for each variant:

Path                               Description
R_HOME/lib64/R/lib                 base directory where R looks for shared libraries by default
R_HOME/lib64/R/lib/libRblas.so     symlink to chosen BLAS library (from one of the subdirectories herein)
R_HOME/lib64/R/lib/libRlapack.so   symlink to chosen LAPACK library (from one of the subdirectories herein)
R_HOME/lib64/R/lib/atlas           directory to hold libatlas.so
R_HOME/lib64/R/lib/rblas           directory to hold the bundled libRblas.so and libRlapack.so produced by R build procedure
R_HOME/lib64/R/lib/mkl             directory to hold MKL variants
R_HOME/lib64/R/lib/mkl/seq         directory to hold sequential MKL variant
R_HOME/lib64/R/lib/mkl/thr         directory to hold threaded MKL variant
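
The layout in the table could be created in one pass. The following is a minimal sketch, not the procedure actually used on our clusters; the function name make_blas_layout is hypothetical, and the library directory is passed in as an argument so the sketch can be tried against a scratch directory first.

```shell
#!/bin/sh
# Create the per-variant subdirectories beneath an R shared-library directory.
# make_blas_layout is a hypothetical helper name, not part of R.
make_blas_layout() {
    libdir=$1
    for variant in atlas rblas mkl/seq mkl/thr; do
        # mkdir -p also creates the intermediate mkl directory
        mkdir -p "$libdir/$variant" || return 1
    done
}

# Example (assumes R_HOME is the installation prefix, as in the table):
#   make_blas_layout "${R_HOME}/lib64/R/lib"
```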

R BLAS/LAPACK

When we restructured the R lib64/R/lib directory, the bundled libRblas.so and libRlapack.so shared library files were moved to the rblas subdirectory. To configure R to use its bundled libraries:

$ cd ${R_HOME}/lib64/R/lib
$ rm -f libR{blas,lapack}.so
$ ln -s rblas/libRblas.so .
$ ln -s rblas/libRlapack.so .

ATLAS

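The ATLAS library contains both BLAS and LAPACK APIs in a single shared library. With libatlas.so copied into the R_HOME/lib64/R/lib/atlas subdirectory (the commands below also expect libRblas.so and libRlapack.so to be present there, presumably as symlinks to libatlas.so), we configure R to use ATLAS: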

$ cd ${R_HOME}/lib64/R/lib
$ rm -f libR{blas,lapack}.so
$ ln -s atlas/libRblas.so .
$ ln -s atlas/libRlapack.so .

Sequential MKL

A C source file containing a dummy function was created in R_HOME/lib64/R/lib/mkl/seq:

shim.c
int
mkl_shim_dummy(void)
{
	return 0;
}

The shim library is then created thusly:

$ cd ${R_HOME}/lib64/R/lib/mkl/seq
$ icc -shared -o libRblas.so -mkl=sequential shim.c
$ ln -s libRblas.so libRlapack.so

To configure R to use the sequential MKL:

$ cd ${R_HOME}/lib64/R/lib
$ rm -f libR{blas,lapack}.so
$ ln -s mkl/seq/libRblas.so .
$ ln -s mkl/seq/libRlapack.so .

Threaded MKL

A C source file containing a dummy function was created in R_HOME/lib64/R/lib/mkl/thr:

shim.c
int
mkl_shim_dummy(void)
{
	return 0;
}

Since our R build used the GNU C compiler, the threaded MKL variant only works if the shim library is built against the GNU OpenMP runtime. Using just "-mkl=parallel" links against the Intel OpenMP runtime, which in testing yielded numerical issues (though no actual crashes). The shim library is therefore created by naming the MKL libraries explicitly:

$ cd ${R_HOME}/lib64/R/lib/mkl/thr
$ icc -shared -o libRblas.so shim.c -lmkl_gnu_thread -lmkl_core -lmkl_intel_lp64
$ ln -s libRblas.so libRlapack.so

To configure R to use the threaded MKL:

$ cd ${R_HOME}/lib64/R/lib
$ rm -f libR{blas,lapack}.so
$ ln -s mkl/thr/libRblas.so .
$ ln -s mkl/thr/libRlapack.so .
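
The four symlink recipes above differ only in the subdirectory name, so they could be wrapped in a single helper. This is a sketch under that assumption; use_blas_variant is a hypothetical name, and unlike the recipes above it refuses to touch the existing symlinks if the requested variant is incomplete.

```shell
#!/bin/sh
# Point libRblas.so and libRlapack.so in an R library directory at one variant.
# use_blas_variant is a hypothetical helper, not part of R.
use_blas_variant() {
    libdir=$1
    variant=$2    # e.g. rblas, atlas, mkl/seq, mkl/thr
    # Check first, so a bad variant name leaves the current symlinks alone.
    for lib in libRblas.so libRlapack.so; do
        if [ ! -e "$libdir/$variant/$lib" ]; then
            echo "missing $libdir/$variant/$lib" >&2
            return 1
        fi
    done
    for lib in libRblas.so libRlapack.so; do
        rm -f "$libdir/$lib"
        # Relative target, matching the `ln -s rblas/libRblas.so .` form above
        ln -s "$variant/$lib" "$libdir/$lib" || return 1
    done
}

# Example:
#   use_blas_variant "${R_HOME}/lib64/R/lib" mkl/seq
```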

Runtime-configurable substitution

By stashing each BLAS/LAPACK variant in its own subdirectory, our copy of R is already fairly close to being runtime-configurable with respect to the choice of BLAS/LAPACK. All R commands set up the environment so that the runtime linker checks R_HOME/lib64/R/lib for shared libraries, so libRblas.so and libRlapack.so symlinks in that directory will always take priority over any other path we might add to LD_LIBRARY_PATH before issuing, for example, the R command. However, if libRblas.so and libRlapack.so are not present in that directory, the runtime linker is forced to consult the other paths present in LD_LIBRARY_PATH.
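
The first-match behaviour described above can be imitated in a few lines of shell, which is handy for predicting which copy of a library a given search path would select. This is only a sketch: first_dir_with is a hypothetical helper, and it walks a colon-separated list in order without modelling anything else the real runtime linker consults (RPATH/RUNPATH, the ld.so cache, default system directories).

```shell
#!/bin/sh
# Report the first directory in a colon-separated search path that contains
# the named library. Empty path entries are skipped.
# first_dir_with is a hypothetical helper name.
first_dir_with() {
    lib=$1
    search_path=$2
    old_ifs=$IFS
    IFS=:
    for dir in $search_path; do
        if [ -n "$dir" ] && [ -e "$dir/$lib" ]; then
            IFS=$old_ifs
            printf '%s\n' "$dir"
            return 0
        fi
    done
    IFS=$old_ifs
    return 1
}

# Example:
#   first_dir_with libRblas.so "${R_HOME}/lib64/R/lib:${LD_LIBRARY_PATH}"
```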

On our Caviness cluster we include no BLAS/LAPACK library symlinks in the base directory which R checks for shared libraries:

$ cd ${R_HOME}/lib64/R/lib
$ rm -f libR{blas,lapack}.so

In our VALET package definition for R we configure four variants that differ by BLAS/LAPACK (as a feature tag)

$ vpkg_versions r
   :
r                The R Project for Statistical Computing
  3.5            alias to r/3.5.1
* 3.5.1          R 3.5.1 with system compilers, ATLAS
  3.5.1:mkl-seq  R 3.5.1 with system compilers, MKL (sequential)
  3.5.1:mkl-thr  R 3.5.1 with system compilers, MKL (multithread)
  3.5.1:rblas    R 3.5.1 with system compilers, R reference BLAS/LAPACK

with ATLAS as the default (recommended) choice of underlying BLAS/LAPACK. Each variant of the 3.5.1 version uses the same installation prefix, but adds a unique BLAS/LAPACK subdirectory to LD_LIBRARY_PATH when added to the user's environment:

$ vpkg_require r/3.5.1
Adding package `r/3.5.1` to your environment
$ echo $LD_LIBRARY_PATH
/opt/shared/r/3.5.1/lib64/R/lib/atlas: ...
$ vpkg_rollback all
$ vpkg_require r/3.5.1:rblas
Adding package `r/3.5.1:rblas` to your environment
$ echo $LD_LIBRARY_PATH
/opt/shared/r/3.5.1/lib64/R/lib/rblas:

This works fine so long as you don't attempt to install any R modules that require BLAS/LAPACK functionality. The R module-building environment defaults to using link search paths in the base directory (R_HOME/lib64/R/lib) and, with the symlinks removed, fails to find libRblas.so or libRlapack.so. This is easily fixed by altering two lines in the R installation's standard make flags file at R_HOME/lib64/R/etc/Makeconf:

BLAS_LIBS = -L"$(R_HOME)/lib$(R_ARCH)/$(R_BLAS_VARIANT)" -lRblas

and

LAPACK_LIBS = -L"$(R_HOME)/lib$(R_ARCH)/$(R_BLAS_VARIANT)" -lRlapack

The link search added here will default to the usual path if R_BLAS_VARIANT is not set in the user's environment. But in our VALET configuration of each of the variants, we set the appropriate relative path for R_BLAS_VARIANT:

$ vpkg_require r/3.5.1:rblas
Adding package `r/3.5.1:rblas` to your environment
$ echo $R_BLAS_VARIANT 
rblas
$ vpkg_rollback all
$ vpkg_require r/3.5.1:mkl-seq
Adding package `r/3.5.1:mkl-seq` to your environment
$ echo $R_BLAS_VARIANT 
mkl/seq

We now have a runtime-configurable BLAS/LAPACK for this single installation of R, and any properly-packaged R modules should build fine against it.
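
The two Makeconf changes described above could be scripted rather than made by hand. The sketch below assumes each of the two assignments sits on a single line beginning with the variable name; patch_makeconf is a hypothetical helper, and it leaves a copy of the original file alongside as Makeconf.bak.

```shell
#!/bin/sh
# Rewrite the BLAS_LIBS and LAPACK_LIBS lines of a Makeconf file so that the
# link search path includes $(R_BLAS_VARIANT). The single quotes keep the
# shell from expanding the $(...) make variables.
# patch_makeconf is a hypothetical helper name.
patch_makeconf() {
    makeconf=$1
    sed -i.bak \
        -e 's|^BLAS_LIBS *=.*|BLAS_LIBS = -L"$(R_HOME)/lib$(R_ARCH)/$(R_BLAS_VARIANT)" -lRblas|' \
        -e 's|^LAPACK_LIBS *=.*|LAPACK_LIBS = -L"$(R_HOME)/lib$(R_ARCH)/$(R_BLAS_VARIANT)" -lRlapack|' \
        "$makeconf"
}

# Example:
#   patch_makeconf "${R_HOME}/lib64/R/etc/Makeconf"
```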

VALET configuration

Here is the VALET configuration for our installation of R 3.5.1 with runtime-configurable BLAS/LAPACK:

r:
  description:       The R Project for Statistical Computing
  url:               http://www.r-project.org/
  prefix:            /opt/shared/r

  default-version:   "3.5.1"

  versions:
    "3.5":
      alias-to:      3.5.1
    "3.5.1":
      description:   R 3.5.1 with system compilers, ATLAS
      actions:
        - libdir:    lib64/R/lib
        - incdir:    lib64/R/include
        - libdir:    lib64/R/lib/atlas
        - variable:  R_BLAS_VARIANT
          operator:  set
          value:     atlas

    "3.5.1:rblas":
      description:   R 3.5.1 with system compilers, R reference BLAS/LAPACK
      prefix:        3.5.1
      actions:
        - libdir:    lib64/R/lib
        - incdir:    lib64/R/include
        - libdir:    lib64/R/lib/rblas
        - variable:  R_BLAS_VARIANT
          operator:  set
          value:     rblas

    "3.5.1:mkl-seq":
      description:   R 3.5.1 with system compilers, MKL (sequential)
      prefix:        3.5.1
      actions:
        - libdir:    lib64/R/lib
        - incdir:    lib64/R/include
        - libdir:    lib64/R/lib/mkl/seq
        - variable:  R_BLAS_VARIANT
          operator:  set
          value:     mkl/seq

    "3.5.1:mkl-thr":
      description:   R 3.5.1 with system compilers, MKL (multithread)
      prefix:        3.5.1
      actions:
        - libdir:    lib64/R/lib
        - incdir:    lib64/R/include
        - libdir:    lib64/R/lib/mkl/thr
        - variable:  R_BLAS_VARIANT
          operator:  set
          value:     mkl/thr
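
Since every variant in this configuration relies on its subdirectory providing both library names, a quick consistency check of the installation may be useful after any change. This is a sketch only; check_blas_variants is a hypothetical helper, and the variant list is copied from the R_BLAS_VARIANT values in the configuration above.

```shell
#!/bin/sh
# For each variant subdirectory, confirm that both library names R links
# against resolve to an existing file (symlinks are followed).
# check_blas_variants is a hypothetical helper name.
check_blas_variants() {
    libdir=$1
    status=0
    for variant in atlas rblas mkl/seq mkl/thr; do
        for lib in libRblas.so libRlapack.so; do
            if [ ! -e "$libdir/$variant/$lib" ]; then
                echo "missing: $variant/$lib" >&2
                status=1
            fi
        done
    done
    return $status
}

# Example:
#   check_blas_variants /opt/shared/r/3.5.1/lib64/R/lib
```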