Adding your own library of R modules (packages) in R_LIBS
The following instructions were adapted from installing personal/program specific R libraries and extensions on Caviness.
List of packages requested to be installed
- rlang - needed to install an updated version for other requested packages
- Rcpp - needed to install an updated version for other requested packages
- ranger - install first because it gets loaded by VSURF
- randomForest
- VSURF
- plyr
- ipred
- e1071
- quantregForest
- car
- rgdal
- rasterVis
- raster
- units - needed to install an updated version for other requested packages
- sf - needed updated version
- reproducible - needed updated version
- SpaDES
- foreach
- doParallel
- openxlsx
- DescTools
- ggplot2
Packages marked as needing an updated version (rlang, Rcpp, units, sf and reproducible) were not requested, but were necessary in order to update the requested R packages.
Considerable research is required to determine what other software needs to be loaded via VALET, which packages need to be updated, and in what order, so that the list of requested packages will install correctly. For example, SpaDES is one of the packages we want to install, so consult its CRAN page, SpaDES: Develop and Run Spatially Explicit Discrete Event Simulation Models, to see the dependencies and imports it requires. Repeat this step for each package.
Preparations
Make sure you connect to Caviness with X11 enabled (Xming for Windows, XQuartz for Mac) before starting, as some of the packages need X11 to compile properly.
Next, make sure you are in your workgroup (i.e. workgroup -g «investing_entity»). We will be using account traine in workgroup it_css for this recipe.
[traine@login01 ~]$ workgroup -g it_css
[(it_css:traine)@login01 ~]$
Next, choose a directory in which to install the R packages. This will depend on a number of factors, but mostly the version of R. The strings «r-version» and «rpkgs-date» denote the version of R and the dated name chosen for the set of R packages; this recipe will use r3.5.1 and spatial-09052020. Do not locate this directory under the /lustre/scratch file system; typically a directory under the workgroup's storage is appropriate:
- If adding the R libraries for multiple users in the workgroup, choose ${WORKDIR}/sw/r/add-ons/«r-version»/«rpkgs-date»/default as the base directory.
- If the R libraries are solely for personal use, choose ${WORKDIR}/users/<username>/sw/r/add-ons/«r-version»/«rpkgs-date»/default, for example.
Note that these examples assume a standard workgroup storage layout with group-writable sw and users directories at the top level. Create the directory:
[(it_css:traine)@login01 ~]$ mkdir -p ${WORKDIR}/sw/r/add-ons/r3.5.1/spatial-09052020/default
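The path convention above can be captured in a small helper so the same layout is reused for later installations. This is only a sketch: the function name r_addons_dir and its arguments are illustrative and not part of VALET or the cluster tooling, and it assumes WORKDIR has already been set by the workgroup command.

```shell
#!/bin/bash
# Hypothetical helper (not part of VALET): build the add-on library path
# from an R version tag and a dated package-set name.
# usage: r_addons_dir <r-version> <rpkgs-date> [username]
r_addons_dir() {
    local r_version="$1" rpkgs_date="$2" user="${3:-}"
    if [ -z "${WORKDIR:-}" ]; then
        echo "WORKDIR is not set; run 'workgroup -g <group>' first" >&2
        return 1
    fi
    if [ -n "$user" ]; then
        # personal install under the workgroup's users directory
        echo "${WORKDIR}/users/${user}/sw/r/add-ons/${r_version}/${rpkgs_date}/default"
    else
        # shared install for the whole workgroup
        echo "${WORKDIR}/sw/r/add-ons/${r_version}/${rpkgs_date}/default"
    fi
}
```

With the function defined, mkdir -p "$(r_addons_dir r3.5.1 spatial-09052020)" creates the same shared directory used in this recipe.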
Now load all the packages via VALET that are needed to install the R libraries, and add the new path to the R_LIBS environment variable for the installation. Due to the number of packages and environment variables required, it is best to create a script called setup-«rpkgs-date».sh using nano or vim. The script for this recipe will be called setup-spatial-09052020.sh, and it also serves as a record for future installations of which packages and environment variables were required.
[(it_css:traine)@login01 ~]$ cat setup-spatial-09052020.sh
#!/bin/bash
# usage: source setup-spatial-09052020.sh

# clear environment
vpkg_rollback all

# Load software via VALET and set environment variables
# that are needed to install requested R packages:
#   ranger VSURF plyr randomForest ipred e1071 quantregForest
#   car rgdal raster rasterVis SpaDES foreach doParallel
#   openxlsx DescTools ggplot2
vpkg_devrequire r/3.5.1:mkl-thr
vpkg_devrequire r-cran/3.5.1:20180715
vpkg_devrequire gdal/2.3.0
vpkg_devrequire proj/5.1.0
vpkg_devrequire netcdf/4.6.1
vpkg_devrequire udunits/2.2.26
export UDUNITS2_LIBS=${UDUNITS_PREFIX}/lib
export UDUNITS2_INCLUDE=${UDUNITS_PREFIX}/include
vpkg_devrequire geos/3.6.2
export GEOS_DIR=${GEOS_PREFIX}

# Add the new R library path to the R_LIBS environment variable so R can
# find it, i.e. the directory created above with
# "mkdir -p ${WORKDIR}/sw/r/add-ons/r3.5.1/spatial-09052020/default"
export R_LIBS="${WORKDIR}/sw/r/add-ons/r3.5.1/spatial-09052020/default:${R_LIBS}"

# display R_LIBS with the new R libraries directory added
echo ${R_LIBS}
Now that we have the setup script, we can source it to set up the environment needed before installing the R packages:
[(it_css:traine)@login01 ~]$ source setup-spatial-09052020.sh
ERROR: no previous session on record, unable to roll back
Adding dependency `gcc/4.9.4` to your environment
Adding dependency `intel/2018u4` to your environment
Adding package `r/3.5.1:mkl-thr` to your environment
Adding package `r-cran/3.5.1:20180715` to your environment
Adding package `gdal/2.3.0` to your environment
Adding package `proj/5.1.0` to your environment
Adding dependency `szip/2.1.1` to your environment
Adding dependency `hdf4/4.2.13` to your environment
Adding dependency `hdf5/1.10.2` to your environment
Adding package `netcdf/4.6.1` to your environment
Adding package `udunits/2.2.26` to your environment
Adding package `geos/3.6.2` to your environment
/opt/shared/r/add-ons/r3.5.1/cran/20180715
/work/it_css/sw/r/add-ons/r3.5.1/spatial-09052020/default:/opt/shared/r/add-ons/r3.5.1/cran/20180715
[(it_css:traine)@login01 ~]$
The first time you run the above script, VALET reports ERROR: no previous session on record, unable to roll back if you have not loaded any other packages with VALET beforehand. As the output above shows, the rest of the script still runs.
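Before starting a long installation it can help to confirm that the new directory really is the first entry in R_LIBS, since install.packages() installs into the first element of .libPaths() by default and R leaves out library directories that do not exist. The check below is a sketch under those assumptions; the function name check_r_libs_first is illustrative and is not provided by VALET or R.

```shell
#!/bin/bash
# Hypothetical check (not part of VALET): confirm the first R_LIBS entry is
# the directory we expect, and that it exists and is writable.
# usage: check_r_libs_first <expected-directory>
check_r_libs_first() {
    local expected="$1"
    local first="${R_LIBS:-}"
    first="${first%%:*}"          # keep only the text before the first ':'
    if [ "$first" != "$expected" ]; then
        echo "first R_LIBS entry is '${first}', expected '${expected}'" >&2
        return 1
    fi
    if [ ! -d "$expected" ] || [ ! -w "$expected" ]; then
        echo "'${expected}' is missing or not writable" >&2
        return 1
    fi
    echo "R_LIBS starts with ${expected}"
}
```

For this recipe it would be called as check_r_libs_first "${WORKDIR}/sw/r/add-ons/r3.5.1/spatial-09052020/default" after sourcing the setup script.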
Install the R Packages
In most cases it will be necessary to create an R script, such as install-«rpkgs-date».R, using nano or vim, because of the large number of packages and the information required to install each one. An R script reduces typing and potential mistakes, makes it easier to apply changes in response to errors, and again documents your steps for future installations. For this example we will call it install-spatial-09052020.R to match our setup script naming convention.
[(it_css:traine)@login01 ~]$ cat install-spatial-09052020.R
# usage: R CMD BATCH install-spatial-09052020.R &

# show the library search path
.libPaths()
chooseCRANmirror(ind = 56)

# configure arguments shared by the install calls below
proj_args <- c('--with-proj-lib=/opt/shared/proj/5.1.0/lib',
               '--with-proj-include=/opt/shared/proj/5.1.0/include')
udunits_args <- c('--with-udunits2-lib=/opt/shared/udunits/2.2.26/lib',
                  '--with-udunits2-include=/opt/shared/udunits/2.2.26/include')

# updated versions needed by other requested packages
install.packages("rlang", type = "source", dependencies = TRUE)
library(rlang)
install.packages("Rcpp", type = "source", dependencies = TRUE)
library(Rcpp)

# ranger is installed first because it gets loaded by VSURF
install.packages("ranger", type = "source", configure.args = proj_args, dependencies = TRUE)
library(ranger)
install.packages("randomForest", type = "source", configure.args = proj_args, dependencies = TRUE)
library(randomForest)
install.packages("VSURF", type = "source", configure.args = proj_args, dependencies = TRUE)
library(VSURF)
install.packages("plyr", type = "source", configure.args = proj_args, dependencies = TRUE)
library(plyr)
install.packages("ipred", type = "source", configure.args = proj_args, dependencies = TRUE)
library(ipred)
install.packages("e1071", type = "source", configure.args = proj_args, dependencies = TRUE)
library(e1071)
install.packages("quantregForest", type = "source", configure.args = proj_args, dependencies = TRUE)
library(quantregForest)
install.packages("car", type = "source", configure.args = proj_args, dependencies = TRUE)

# spatial packages
install.packages("units", type = "source", configure.args = udunits_args, dependencies = TRUE)
library(units)
install.packages("rgdal", type = "source", configure.args = proj_args, dependencies = TRUE)
library(rgdal)
install.packages("rasterVis", type = "source", configure.args = proj_args, dependencies = TRUE)
library(rasterVis)
install.packages("raster", type = "source", configure.args = proj_args, dependencies = TRUE)
install.packages("sf", dependencies = TRUE)
library(sf)
install.packages("reproducible", type = "source", dependencies = TRUE)
library(reproducible)
install.packages("SpaDES", type = "source", configure.args = udunits_args, dependencies = TRUE)
library(SpaDES)

# not installed by this script; loading them confirms they are
# already available on the library search path
library(foreach)
library(doParallel)
library(openxlsx)
library(DescTools)
library(ggplot2)
Now install the R libraries by running the R script as follows:
[(it_css:traine)@login01 ~]$ R CMD BATCH install-spatial-09052020.R &
[(it_css:traine)@login01 ~]$
The trailing & runs the R script in the background, since this installation may take a long time depending on the number of packages. An output file called install-spatial-09052020.Rout is created to keep track of the installation. You can watch the progress by using
[(it_css:traine)@login01 ~]$ tail install-spatial-09052020.Rout
Review Installation Output
Review the generated output file, install-spatial-09052020.Rout, for details on each package installation. If there are errors, fix the install R script and rerun it.
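Because the .Rout file can run to thousands of lines, a quick scan for the lines that usually accompany a failed installation can point to where to start reading. The helper below is a heuristic sketch only: the name scan_rout is illustrative, and the patterns (lines starting with ERROR or Error, and the "non-zero exit status" warning from install.packages) are not an exhaustive list of failure messages.

```shell
#!/bin/bash
# Hypothetical helper: list lines in an R batch log that usually signal a
# failed package installation. The patterns are a heuristic, not exhaustive.
# usage: scan_rout <file.Rout>
# returns 0 if nothing suspicious was found, 1 if matches were printed,
# 2 if the file does not exist
scan_rout() {
    local rout="$1"
    if [ ! -f "$rout" ]; then
        echo "no such file: ${rout}" >&2
        return 2
    fi
    # -n prints line numbers so each hit can be located in the log
    if grep -n -E '^(ERROR|Error)|non-zero exit status' "$rout"; then
        return 1
    fi
    echo "no error markers found in ${rout}"
}
```

For this recipe it would be run as scan_rout install-spatial-09052020.Rout once the background job has finished. A clean result does not replace reading the log.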
Final List of R packages Installed
$ ls $WORKDIR/sw/r/add-ons/r3.5.1/spatial-09052020/default
car        leafem    quantregForest  rgdal          SpaDES.tools
dplyr      leaflet   quickPlot       rgeos          stars
e1071      leafpop   randomForest    rlang          tibble
ellipsis   leafsync  ranger          scales         tidyselect
fansi      mapview   raster          sf             units
gdalUtils  ncdf4     rasterVis       SpaDES         vctrs
ggforce    pillar    Rcpp            SpaDES.addins  VSURF
ipred      plyr      reproducible    SpaDES.core
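A listing like the one above can be compared against the requested package list mechanically. The sketch below assumes that an installed R package appears as a directory containing a DESCRIPTION file; the function name missing_r_pkgs is illustrative and not part of R or VALET.

```shell
#!/bin/bash
# Hypothetical helper: print the package names that have no installed
# directory (one containing a DESCRIPTION file) in a given R library directory.
# usage: missing_r_pkgs <library-dir> <pkg> [<pkg> ...]
missing_r_pkgs() {
    local libdir="$1"
    shift
    local pkg
    for pkg in "$@"; do
        if [ ! -f "${libdir}/${pkg}/DESCRIPTION" ]; then
            echo "$pkg"
        fi
    done
}
```

A name printed by this function is only absent from that one directory. It may still be provided by another entry in R_LIBS, so treat the output as a list of packages to check rather than a list of failures.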
VALET Package for R Packages
The last step is to create a VALET package to load the correct version of R and your newly installed R packages. The easiest way to do this is to copy the VALET package for the version of R you used for the installation, add your R_LIBS and load the dependent packages. For this recipe, we will call the VALET package spatial-r.vpkg_yaml, and it should be put in the valet directory inside the workgroup sw directory so it will be found when using your workgroup.
[(it_css:traine)@login01 valet]$ pwd
/work/it_css/sw/valet
[(it_css:traine)@login01 valet]$ cat spatial-r.vpkg_yaml
spatial-r:
    description: The Comprehensive R Archive Network
    url: http://cran.us.r-project.org/
    prefix: /opt/shared/r/add-ons
    default-version: "3.5.1:20180715"
    development-env: false
    actions:
        - action: path-prepend
          variable: R_LIBS
          value: ${VALET_PATH_PREFIX}
    versions:
        "3.5.1:20180715":
            description: CRAN snapshot from 07/15/2018, includes 11,486 R modules
            prefix: r3.5.1/cran/20180715
            dependencies:
                - r/3.5.1:mkl-thr
                - gdal/2.3.0
                - proj/5.1.0
                - netcdf/4.6.1
                - udunits/2.2.26
                - geos/3.6.2
            actions:
                - action: path-prepend
                  variable: R_LIBS
                  value: /work/it_css/sw/r/add-ons/r3.5.1/spatial-09052020/default
Testing R packages
Now that you have installed your R packages, test them by using VALET to set up the new environment: first clear the current environment, then load the new VALET package.
[(it_css:traine)@login01 ~]$ vpkg_rollback all
[(it_css:traine)@login01 ~]$ vpkg_require spatial-r
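Once the VALET package is loaded, one way to exercise the installation is to try loading each requested package in a fresh R process. The following is a sketch, not part of the original recipe: the function name try_load_r_pkgs and the RSCRIPT override variable are illustrative, and it assumes Rscript is on the PATH after vpkg_require spatial-r.

```shell
#!/bin/bash
# Hypothetical smoke test: try to load each named package in a fresh R
# process and report the ones that fail. RSCRIPT is an illustrative override
# for the Rscript executable and defaults to the one found on PATH.
# usage: try_load_r_pkgs <pkg> [<pkg> ...]
try_load_r_pkgs() {
    local rscript="${RSCRIPT:-Rscript}"
    local pkg failed=0
    for pkg in "$@"; do
        # Rscript exits with a non-zero status when library() raises an error
        if "$rscript" -e "suppressPackageStartupMessages(library(${pkg}))" >/dev/null 2>&1; then
            echo "ok      ${pkg}"
        else
            echo "FAILED  ${pkg}"
            failed=1
        fi
    done
    return "$failed"
}
```

For example, try_load_r_pkgs ranger VSURF sf raster SpaDES ggplot2 prints one line per package and returns a non-zero status if any of them could not be loaded.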