===== R on Caviness =====
==== Learning R ====
=== SWIRL ===
In addition to other resources, SWIRL is installed on the Caviness cluster as an interactive learning guide that runs
inside R:
$ vpkg_require r-cran
$ R -q --no-save
> library(swirl)
> swirl()
==== R libraries and extensions ====
=== Installed library bundles ===
The cluster also has the majority of [[http://cran.us.r-project.org/|CRAN]]
and [[http://www.bioconductor.org/|Bioconductor]] R libraries already
installed, as point-in-time snapshots of their respective catalogs. These
libraries are divided into different VALET packages based on their
dependencies. The current bundles are listed below. Together
these bundles provide access to over 6,600 R modules, pre-compiled and ready
for use.
^r-cran |All CRAN modules which compile and install cleanly without any additional dependencies. N.B. all of the library bundles below require this CRAN bundle as a base.|
^r-cdf |CRAN modules which need NetCDF, HDF4, HDF5, and UDUNITS libraries. |
^r-bioc |The full suite of [[http://www.bioconductor.org/|Bioconductor]] modules. |
^r-fftw |CRAN modules which need FFTW |
^r-geo |CRAN modules which need GEOS (Geometry Engine, Open Source), GDAL (Geospatial Data Abstraction Library), or PROJ (Cartographic Projections Library) |
^r-gnumath |CRAN modules which need GSL (GNU Scientific Library), GLPK (GNU Linear Programming Kit), or MPFR (GNU MPFR Library) |
^r-jags |CRAN modules which need JAGS (Just Another Gibbs Sampler) and the r-gnumath bundle mentioned above. |
^r-graph |CRAN modules which need Graphviz or GNUplot |
^r-mpi |CRAN modules which need the OpenMPI libraries for parallel computing. |
^r-all |Loads all of the previously mentioned bundles, plus any CRAN module with multiple dependencies from the above list. |
^r-cuda |CRAN modules which need CUDA/GPUs |
=== Loading library bundles for use ===
$ vpkg_require r-geo
Adding dependency `r-bioc/3.5.1:20180715` to your environment
Adding dependency `gsl/1.16` to your environment
Adding dependency `gmp/6.1.2` to your environment
Adding dependency `glpk/4.65` to your environment
Adding dependency `mpfr/4.0.1` to your environment
Adding dependency `r-gnumath/3.5.1:20180715` to your environment
Adding dependency `fftw/3.3.8` to your environment
Adding dependency `r-fftw/3.5.1:20180715` to your environment
Adding dependency `szip/2.1.1` to your environment
Adding dependency `hdf4/4.2.13` to your environment
Adding dependency `hdf5/1.10.2` to your environment
Adding dependency `netcdf/4.6.1` to your environment
Adding dependency `udunits/2.2.26` to your environment
Adding dependency `r-cdf/3.5.1:20180715` to your environment
Adding dependency `geos/3.6.2` to your environment
Adding dependency `gdal/2.3.0` to your environment
Adding dependency `proj/5.1.0` to your environment
Adding package `r-geo/3.5.1:20180715` to your environment
$
With the bundle loaded, its libraries can be used in R as usual:
$ R --no-save -q
> library(CopulaRegression)
Loading required package: MASS
Loading required package: VineCopula
>
=== Learning about modules ===
IT provides a small script called ''r-info'' which displays the internal
documentation of an R module. This is helpful for getting basic information on
a module to decide whether it merits further research. To use this tool, the module
must be installed and its bundle must be loaded with ''vpkg_require''.
For example:
$ vpkg_require r-cran
$ r-info car
Loading required package: carData
Information on package ‘car’
Description:
Package: car
Version: 3.0-0
Date: 2018-03-23
Title: Companion to Applied Regression
...
Further information is available in the following vignettes in
directory ‘/opt/shared/r/add-ons/r3.5.1/cran/20180715/car/doc’:
embedding: Using car functions inside user functions (source, pdf)
$
==== Personal/program specific R libraries and extensions ====
You can create your own library of R modules containing versions different
from those provided through VALET, or modules not available via VALET.
R reads an environment variable called ''R_LIBS'' to obtain a list of
locations to search for modules. Make sure your entry is first
in the list so that your library overrides any conflicting modules
installed on the system. This also matters because R installs
modules into the first entry in this list by default.
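The ordering rule above can be sketched in shell. This is a minimal sketch, not a site-provided script: the variable name ''MY_R_LIB'' is hypothetical, and the fallback to a temporary directory when ''WORKDIR'' is unset exists only so the sketch runs outside the cluster.

```shell
# On the cluster, WORKDIR is set by the workgroup command. The mktemp
# fallback is an assumption so this sketch can run anywhere.
base="${WORKDIR:-$(mktemp -d)}"

# Hypothetical personal library directory, following the layout used on this page.
MY_R_LIB="$base/sw/r/add-ons/r3.5.1/testing/default"
mkdir -p "$MY_R_LIB"

# Put the personal library first. The ${R_LIBS:+...} expansion adds the
# separator and the old value only when R_LIBS is already set, so no
# trailing colon is left behind when it is empty.
export R_LIBS="$MY_R_LIB${R_LIBS:+:$R_LIBS}"

# Show the search order, one entry per line.
echo "$R_LIBS" | tr ':' '\n'
```

The first line printed should be the personal library directory; anything a VALET package had already placed in ''R_LIBS'' follows it.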
=== Simple example ===
Once your personal library directory is first in ''R_LIBS'', you can install modules with ''install.packages''. Make sure you are in your workgroup (e.g. ''workgroup -g <<//investing-entity//>>''). Here
is an example:
$ workgroup -g it_css
$ vpkg_require r-cran
Adding dependency `r/3.5.1` to your environment
Adding package `r-cran/3.5.1:20180715` to your environment
$ mkdir -p $WORKDIR/sw/r/add-ons/r3.5.1/testing/default
$ echo $R_LIBS
/opt/shared/r/add-ons/r3.5.1/cran/20180715
$ export R_LIBS="$WORKDIR/sw/r/add-ons/r3.5.1/testing/default:$R_LIBS"
$ R -q --no-save
> .libPaths()
[1] "/work/it_css/sw/r/add-ons/r3.5.1/testing/default"
[2] "/opt/shared/r/add-ons/r3.5.1/cran/20180715"
[3] "/opt/shared/r/3.5.1/lib64/R/library"
> chooseCRANmirror(all)
Secure CRAN mirrors
1: 0-Cloud [https] 2: Algeria [https]
3: Australia (Canberra) [https] 4: Australia (Melbourne 1) [https]
5: Australia (Melbourne 2) [https] 6: Australia (Perth) [https]
7: Austria [https] 8: Belgium (Ghent) [https]
9: Brazil (PR) [https] 10: Brazil (RJ) [https]
11: Brazil (SP 1) [https] 12: Brazil (SP 2) [https]
13: Bulgaria [https] 14: Chile [https]
15: China (Hong Kong) [https] 16: China (Lanzhou) [https]
17: China (Shanghai) [https] 18: Colombia (Cali) [https]
19: Czech Republic [https] 20: Denmark [https]
21: Ecuador (Cuenca) [https] 22: Ecuador (Quito) [https]
23: Estonia [https] 24: France (Lyon 2) [https]
25: France (Marseille) [https] 26: France (Montpellier) [https]
27: Germany (Erlangen) [https] 28: Germany (Göttingen) [https]
29: Germany (Münster) [https] 30: Germany (Regensburg) [https]
31: Greece [https] 32: Hungary [https]
33: Iceland [https] 34: Indonesia (Jakarta) [https]
35: Italy (Padua) [https] 36: Japan (Tokyo) [https]
37: Japan (Yonezawa) [https] 38: Korea (Busan) [https]
39: Korea (Gyeongsan-si) [https] 40: Korea (Seoul 1) [https]
41: Korea (Ulsan) [https] 42: Malaysia [https]
43: Mexico (Mexico City) [https] 44: Norway [https]
45: Philippines [https] 46: Serbia [https]
47: Spain (Madrid) [https] 48: Sweden [https]
49: Switzerland [https] 50: Turkey (Denizli) [https]
51: Turkey (Mersin) [https] 52: UK (Bristol) [https]
53: UK (London 1) [https] 54: USA (CA 1) [https]
55: USA (IA) [https] 56: USA (KS) [https]
57: USA (MI 1) [https] 58: USA (MI 2) [https]
59: USA (OR) [https] 60: USA (TN) [https]
61: USA (TX 1) [https] 62: Uruguay [https]
63: (other mirrors)
Selection: 55
> install.packages("KernSmooth", dependencies=TRUE)
Installing package into ‘/work/it_css/sw/r/add-ons/r3.5.1/testing/default’
(as ‘lib’ is unspecified)
trying URL 'https://ftp.osuosl.org/pub/cran/src/contrib/KernSmooth_2.23-15.tar.gz'
Content type 'application/x-gzip' length 24572 bytes (23 KB)
==================================================
downloaded 23 KB
* installing *source* package ‘KernSmooth’ ...
** package ‘KernSmooth’ successfully unpacked and MD5 sums checked
** libs
gfortran -fpic -g -O2 -c blkest.f -o blkest.o
gfortran -fpic -g -O2 -c cp.f -o cp.o
gfortran -fpic -g -O2 -c dgedi.f -o dgedi.o
gfortran -fpic -g -O2 -c dgefa.f -o dgefa.o
gfortran -fpic -g -O2 -c dgesl.f -o dgesl.o
gcc -std=gnu99 -I"/opt/shared/r/3.5.1/lib64/R/include" -DNDEBUG -I/opt/shared/gcc/4.9.4/include -fpic -g -O2 -c init.c -o init.o
gfortran -fpic -g -O2 -c linbin.f -o linbin.o
gfortran -fpic -g -O2 -c linbin2D.f -o linbin2D.o
gfortran -fpic -g -O2 -c locpoly.f -o locpoly.o
gfortran -fpic -g -O2 -c rlbin.f -o rlbin.o
gfortran -fpic -g -O2 -c sdiag.f -o sdiag.o
gfortran -fpic -g -O2 -c sstdiag.f -o sstdiag.o
gcc -std=gnu99 -shared -L/opt/shared/r/3.5.1/lib64/R/lib -L/opt/shared/gcc/4.9.4/lib -L/opt/shared/gcc/4.9.4/lib64 -o KernSmooth.so blkest.o cp.o dgedi.o dgefa.o dgesl.o init.o linbin.o linbin2D.o locpoly.o rlbin.o sdiag.o sstdiag.o -L/opt/shared/r/3.5.1/lib64/R/lib/atlas -lRblas -lgfortran -lm -lquadmath -lgfortran -lm -lquadmath -L/opt/shared/r/3.5.1/lib64/R/lib -lR
installing to /work/it_css/sw/r/add-ons/r3.5.1/testing/default/KernSmooth/libs
** R
** inst
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded
* DONE (KernSmooth)
The downloaded source packages are in
‘/tmp/RtmpVq5oBb/downloaded_packages’
> library(KernSmooth)
KernSmooth 2.23 loaded
Copyright M. P. Wand 1997-2009
>
Notice that the output of ''.libPaths()'' lists the personal library directory first. It is also very important to make sure all dependencies for a library are installed by using the option ''dependencies=TRUE'' with ''install.packages''. If the developers of a particular library did not provide a dependency as part of the install, you may need to install that dependency yourself before installing the library you want. In the complex example below, even though ''terra'' does not indicate a dependency on ''quantregForest'', it may be necessary to run ''install.packages("quantregForest", dependencies=TRUE)'' before ''install.packages("terra", dependencies=TRUE)''.

Unfortunately, much of this requires reviewing the developer documentation for each library to determine its system requirements and dependencies, keeping in mind that these dependencies may not be available or automatically installed. On top of that, some dependencies require other packages to be installed first, so it can become a catch-22 that takes trial and error to find the correct order. For example, ''bslib'' is needed by ''rmarkdown'', which in turn is needed for the ''quantregForest'' installation.

Lastly, check which VALET packages are already available to support installing R libraries, such as the required version of the compiler and system libraries like ''gdal'', ''proj'', and ''geos''. Using ''vpkg_devrequire'' loads other dependencies and also sets environment variables such as ''LDFLAGS'', ''CPPFLAGS'', ''LIBRARY_PATH'', and ''LD_LIBRARY_PATH'', along with the ''PREFIX'' variables needed to find the correct files while installing a particular R library. For example, loading ''gdal'' and ''geos'' brings a number of other dependencies into the environment, together with the environment variables R needs to install a library.

As part of this trial and error, the ''units'' package failed to configure because the ''udunits'' header was not found. This required quitting R, loading the library with ''vpkg_devrequire udunits/2.2'', setting the environment variables ''UDUNITS2_INCLUDE'' and ''UDUNITS2_LIBS'' based on the ''UDUNITS_PREFIX'' variable set by VALET, and starting R again to retry the install. This is the trial-and-error process in a nutshell. Finally, you will need to load all the same VALET packages using ''vpkg_require'' and set the same environment variables in each session where you use the libraries installed in your ''R_LIBS'' directory.
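The ''udunits'' step described above can be wrapped in a small helper. This is a minimal sketch: the function name ''set_udunits_vars'' is hypothetical, and the fallback prefix is simply the path shown in the transcript on this page, used only so the sketch runs where VALET is not available.

```shell
# Hypothetical helper: derive the two variables the 'units' configure script
# checks from the prefix VALET exports after `vpkg_devrequire udunits/2.2`.
set_udunits_vars() {
    if [ -z "${UDUNITS_PREFIX:-}" ]; then
        echo "UDUNITS_PREFIX is not set; run vpkg_devrequire udunits/2.2 first" >&2
        return 1
    fi
    export UDUNITS2_INCLUDE="$UDUNITS_PREFIX/include"
    export UDUNITS2_LIBS="$UDUNITS_PREFIX/lib"
    # Warn early if the header that configure looks for is missing.
    if [ ! -f "$UDUNITS2_INCLUDE/udunits2.h" ]; then
        echo "warning: udunits2.h not found under $UDUNITS2_INCLUDE" >&2
    fi
}

# Assumption for running outside the cluster: fall back to the prefix
# printed in the transcript when VALET has not set one.
: "${UDUNITS_PREFIX:=/opt/shared/udunits/2.2.26}"

set_udunits_vars
echo "UDUNITS2_INCLUDE=$UDUNITS2_INCLUDE"
echo "UDUNITS2_LIBS=$UDUNITS2_LIBS"
```

Run the helper before starting R, so the variables are already in the environment when ''install.packages("units", dependencies=TRUE)'' invokes the configure script.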
=== Complex example ===
[traine@login01.caviness ~]$ workgroup -g it_css
[(it_css:traine)@login01.caviness ~]$ vpkg_devrequire r/4.1 gdal/3.4.3 geos/3.9.1
Adding dependency `binutils/2.35` to your environment
Adding dependency `gcc/11.2.0` to your environment
Adding dependency `atlas/3.10.3` to your environment
Adding package `r/4.1.3` to your environment
Adding dependency `szip/2.1.1` to your environment
Adding dependency `hdf4/4.2.13` to your environment
Adding dependency `hdf5/1.10.2` to your environment
Adding dependency `netcdf/4.6.1` to your environment
Adding dependency `sqlite3/3.34.1` to your environment
Adding dependency `proj/8.2.1` to your environment
Adding package `gdal/3.4.3` to your environment
Adding package `geos/3.9.1` to your environment
[(it_css:traine)@login01.caviness ~]$ echo $LDFLAGS
-L/opt/shared/binutils/2.35/lib -L/opt/shared/gcc/11.2.0/lib -L/opt/shared/gcc/11.2.0/lib64 -L/opt/shared/atlas/3.10.3/lib -L/opt/shared/r/4.1.3/lib64 -L/opt/shared/r/4.1.3/lib64/R/lib -L/opt/shared/r/4.1.3/lib64/R/lib/atlas -L/opt/shared/szip/2.1.1/lib -L/opt/shared/hdf4/4.2.13/lib -L/opt/shared/hdf5/1.10.2/lib -L/opt/shared/netcdf/4.6.1/lib -L/opt/shared/sqlite3/3.34.1/lib -L/opt/shared/proj/8.2.1/lib -L/opt/shared/gdal/3.4.3/lib -L/opt/shared/geos/3.9.1/lib
[(it_css:traine)@login01.caviness ~]$ echo $CPPFLAGS
-I/opt/shared/binutils/2.35/include -I/opt/shared/gcc/11.2.0/include -I/opt/shared/atlas/3.10.3/include -I/opt/shared/r/4.1.3/include -I/opt/shared/r/4.1.3/lib64/R/include -I/opt/shared/szip/2.1.1/include -I/opt/shared/hdf4/4.2.13/include -I/opt/shared/hdf5/1.10.2/include -I/opt/shared/netcdf/4.6.1/include -I/opt/shared/sqlite3/3.34.1/include -I/opt/shared/proj/8.2.1/include -I/opt/shared/gdal/3.4.3/include -I/opt/shared/geos/3.9.1/include
[(it_css:traine)@login01.caviness ~]$ echo $LIBRARY_PATH
/opt/shared/geos/3.9.1/lib:/opt/shared/gdal/3.4.3/lib:/opt/shared/proj/8.2.1/lib:/opt/shared/sqlite3/3.34.1/lib:/opt/shared/netcdf/4.6.1/lib:/opt/shared/hdf5/1.10.2/lib:/opt/shared/hdf4/4.2.13/lib:/opt/shared/szip/2.1.1/lib:/opt/shared/r/4.1.3/lib64/R/lib/atlas:/opt/shared/r/4.1.3/lib64/R/lib:/opt/shared/r/4.1.3/lib64:/opt/shared/atlas/3.10.3/lib:/opt/shared/gcc/11.2.0/lib64:/opt/shared/gcc/11.2.0/lib:/opt/shared/binutils/2.35/lib
[(it_css:traine)@login01.caviness ~]$ echo $LD_LIBRARY_PATH
/opt/shared/geos/3.9.1/lib:/opt/shared/gdal/3.4.3/lib:/opt/shared/proj/8.2.1/lib:/opt/shared/sqlite3/3.34.1/lib:/opt/shared/netcdf/4.6.1/lib:/opt/shared/hdf5/1.10.2/lib:/opt/shared/hdf4/4.2.13/lib:/opt/shared/szip/2.1.1/lib:/opt/shared/r/4.1.3/lib64/R/lib/atlas:/opt/shared/r/4.1.3/lib64/R/lib:/opt/shared/r/4.1.3/lib64:/opt/shared/atlas/3.10.3/lib:/opt/shared/gcc/11.2.0/lib64:/opt/shared/gcc/11.2.0/lib:/opt/shared/binutils/2.35/lib:/opt/shared/slurm/lib
[(it_css:traine)@login01.caviness ~]$ env | grep PREFIX
R_PREFIX=/opt/shared/r/4.1.3
GDAL_PREFIX=/opt/shared/gdal/3.4.3
GCC_PREFIX=/opt/shared/gcc/11.2.0
GEOS_PREFIX=/opt/shared/geos/3.9.1
HDF5_PREFIX=/opt/shared/hdf5/1.10.2
SQLITE3_PREFIX=/opt/shared/sqlite3/3.34.1
PROJ_PREFIX=/opt/shared/proj/8.2.1
NETCDF_PREFIX=/opt/shared/netcdf/4.6.1
GQUEUE_PREFIX=/opt/shared/gqueue
HDF4_PREFIX=/opt/shared/hdf4/4.2.13
BINUTILS_PREFIX=/opt/shared/binutils/2.35
SZIP_PREFIX=/opt/shared/szip/2.1.1
ATLAS_PREFIX=/opt/shared/atlas/3.10.3
[(it_css:traine)@login01.caviness ~]$ mkdir -p $WORKDIR/sw/r/add-ons/r4.1.3/terra
[(it_css:traine)@login01.caviness ~]$ export R_LIBS="$WORKDIR/sw/r/add-ons/r4.1.3/terra"
[(it_css:traine)@login01.caviness ~]$ R -q --no-save
> .libPaths()
[1] "/work/it_css/sw/r/add-ons/r4.1.3/terra"
[2] "/opt/shared/r/4.1.3/lib64/R/library"
> chooseCRANmirror(all)
Secure CRAN mirrors
1: 0-Cloud [https]
2: Australia (Canberra) [https]
3: Australia (Melbourne 1) [https]
4: Australia (Melbourne 2) [https]
5: Australia (Perth) [https]
6: Austria [https]
...
69: USA (IA) [https]
70: USA (MI) [https]
71: USA (MO) [https]
72: USA (OH) [https]
73: USA (OR) [https]
74: USA (TN) [https]
75: United Arab Emirates [https]
76: Uruguay [https]
77: (other mirrors)
Selection: 69
> install.packages("quantregForest", dependencies=TRUE)
Installing package into ‘/work/it_css/sw/r/add-ons/r4.1.3/terra’
(as ‘lib’ is unspecified)
also installing the dependencies ‘fs’, ‘R6’, ‘rappdirs’, ‘base64enc’, ‘cachem’, ‘lifecycle’, ‘memoise’, ‘mime’, ‘rlang’, ‘sass’, ‘digest’, ‘ellipsis’, ‘fastmap’, ‘cli’, ‘glue’, ‘magrittr’, ‘stringi’, ‘vctrs’, ‘evaluate’, ‘highr’, ‘xfun’, ‘yaml’, ‘bslib’, ‘fontawesome’, ‘htmltools’, ‘jquerylib’, ‘jsonlite’, ‘stringr’, ‘tinytex’, ‘randomForest’, ‘RColorBrewer’, ‘gss’, ‘knitr’, ‘rmarkdown’
trying URL 'https://mirror.las.iastate.edu/CRAN/src/contrib/fs_1.6.3.tar.gz'
Content type 'application/x-gzip' length 1185603 bytes (1.1 MB)
==================================================
downloaded 1.1 MB
...
...
...
ERROR: dependency ‘bslib’ is not available for package ‘rmarkdown’
* removing ‘/work/it_css/sw/r/add-ons/r4.1.3/terra/rmarkdown’
The downloaded source packages are in
‘/tmp/RtmpaBulVZ/downloaded_packages’
Warning message:
In install.packages("quantregForest", dependencies = TRUE) :
installation of package ‘rmarkdown’ had non-zero exit status
> library (quantregForest)
Loading required package: randomForest
randomForest 4.7-1.1
Type rfNews() to see new features/changes/bug fixes.
Loading required package: RColorBrewer
> install.packages("bslib", dependencies=TRUE)
Installing package into ‘/work/it_css/sw/r/add-ons/r4.1.3/terra’
(as ‘lib’ is unspecified)
also installing the dependencies ‘colorspace’, ‘utf8’, ‘prettyunits’, ‘labeling’, ‘munsell’, ‘viridisLite’, ‘fansi’, ‘pillar’, ‘pkgconfig’, ‘Rcpp’, ‘rprojroot’, ‘pkgbuild’, ‘diffobj’, ‘rematch2’, ‘gtable’, ‘isoband’, ‘scales’, ‘tibble’, ‘httpuv’, ‘xtable’, ‘sourcetools’, ‘later’, ‘promises’, ‘crayon’, ‘commonmark’, ‘brio’, ‘callr’, ‘desc’, ‘pkgload’, ‘praise’, ‘processx’, ‘ps’, ‘waldo’, ‘farver’, ‘rstudioapi’, ‘bsicons’, ‘curl’, ‘ggplot2’, ‘rmarkdown’, ‘shiny’, ‘testthat’, ‘thematic’, ‘withr’
...
...
checking for udunits2.h... no
checking for udunits2/udunits2.h... no
checking for ut_read_xml in -ludunits2... yes
configure: error: in `/tmp/Rtmpc5hPQz/R.INSTALL540e8c2492d/units':
configure: error:
--------------------------------------------------------------------------------
Configuration failed because udunits2.h was not found. Try installing:
* deb: libudunits2-dev (Debian, Ubuntu, ...)
* rpm: udunits2-devel (Fedora, EPEL, ...)
* brew: udunits (OSX)
If udunits2 is already installed in a non-standard location, use:
--configure-args='--with-udunits2-lib=/usr/local/lib'
if the library was not found, and/or:
--configure-args='--with-udunits2-include=/usr/include/udunits2'
if the header was not found, replacing paths with appropriate values.
You can alternatively set UDUNITS2_INCLUDE and UDUNITS2_LIBS manually.
--------------------------------------------------------------------------------
See `config.log' for more details
ERROR: configuration failed for package ‘units’
* removing ‘/work/it_css/sw/r/add-ons/r4.1.3/terra/units’
The downloaded source packages are in
‘/tmp/Rtmps25MZ6/downloaded_packages’
Warning message:
In install.packages("units", dependencies = TRUE) :
installation of package ‘units’ had non-zero exit status
> quit()
[(it_css:traine)@login01.caviness ~]$ vpkg_devrequire udunits/2.2
Adding package `udunits/2.2.26` to your environment
[(it_css:traine)@login01.caviness ~]$ env | grep UDUNITS
UDUNITS_PREFIX=/opt/shared/udunits/2.2.26
[(it_css:traine)@login01.caviness ~]$ export UDUNITS2_INCLUDE=$UDUNITS_PREFIX/include
[(it_css:traine)@login01.caviness ~]$ export UDUNITS2_LIBS=$UDUNITS_PREFIX/lib
[(it_css:traine)@login01.caviness ~]$ R -q --no-save
> .libPaths()
[1] "/work/it_css/sw/r/add-ons/r4.1.3/terra"
[2] "/opt/shared/r/4.1.3/lib64/R/library"
> chooseCRANmirror(all)
Secure CRAN mirrors
1: 0-Cloud [https]
2: Australia (Canberra) [https]
3: Australia (Melbourne 1) [https]
4: Australia (Melbourne 2) [https]
5: Australia (Perth) [https]
6: Austria [https]
...
...
69: USA (IA) [https]
70: USA (MI) [https]
71: USA (MO) [https]
72: USA (OH) [https]
73: USA (OR) [https]
74: USA (TN) [https]
75: United Arab Emirates [https]
76: Uruguay [https]
77: (other mirrors)
Selection: 69
> install.packages("units", dependencies=TRUE)
Installing package into ‘/work/it_css/sw/r/add-ons/r4.1.3/terra’
(as ‘lib’ is unspecified)
trying URL 'https://mirror.las.iastate.edu/CRAN/src/contrib/units_0.8-4.tar.gz'
Content type 'application/x-gzip' length 248024 bytes (242 KB)
==================================================
downloaded 242 KB
* installing *source* package ‘units’ ...
** package ‘units’ successfully unpacked and MD5 sums checked
** using staged installation
configure: units: 0.8-4
checking whether the C++ compiler works... yes
checking for C++ compiler default output file name... a.out
checking for suffix of executables...
checking whether we are cross compiling... no
checking for suffix of object files... o
checking whether the compiler supports GNU C++... yes
checking whether /opt/shared/gcc/11.2.0/bin/g++ -std=gnu++14 accepts -g... yes
checking for /opt/shared/gcc/11.2.0/bin/g++ -std=gnu++14 option to enable C++11 features... none needed
checking for stdio.h... yes
checking for stdlib.h... yes
checking for string.h... yes
checking for inttypes.h... yes
checking for stdint.h... yes
checking for strings.h... yes
checking for sys/stat.h... yes
checking for sys/types.h... yes
checking for unistd.h... yes
checking for _Bool... no
checking for stdbool.h that conforms to C99... yes
checking for error_at_line... yes
checking for gcc... /opt/shared/gcc/11.2.0/bin/gcc
checking whether the compiler supports GNU C... yes
checking whether /opt/shared/gcc/11.2.0/bin/gcc accepts -g... yes
checking for /opt/shared/gcc/11.2.0/bin/gcc option to enable C11 features... none needed
checking for XML_ParserCreate in -lexpat... yes
checking for udunits2.h... yes
checking for ut_read_xml in -ludunits2... yes
configure: creating ./config.status
config.status: creating src/Makevars
** libs
/opt/shared/gcc/11.2.0/bin/g++ -std=gnu++14 -I"/opt/shared/r/4.1.3/lib64/R/include" -DNDEBUG -DUDUNITS2_DIR=0 -I/opt/shared/udunits/2.2.26/include -I/opt/shared/udunits/2.2.26/include -I/opt/shared/binutils/2.35/include -I/opt/shared/gcc/11.2.0/include -I/opt/shared/r/4.1.3/include -I'/work/it_css/sw/r/add-ons/r4.1.3/terra/Rcpp/include' -I/opt/shared/binutils/2.35/include -I/opt/shared/gcc/11.2.0/include -I/opt/shared/r/4.1.3/include -fpic -g -O2 -c RcppExports.cpp -o RcppExports.o
/opt/shared/gcc/11.2.0/bin/g++ -std=gnu++14 -I"/opt/shared/r/4.1.3/lib64/R/include" -DNDEBUG -DUDUNITS2_DIR=0 -I/opt/shared/udunits/2.2.26/include -I/opt/shared/udunits/2.2.26/include -I/opt/shared/binutils/2.35/include -I/opt/shared/gcc/11.2.0/include -I/opt/shared/r/4.1.3/include -I'/work/it_css/sw/r/add-ons/r4.1.3/terra/Rcpp/include' -I/opt/shared/binutils/2.35/include -I/opt/shared/gcc/11.2.0/include -I/opt/shared/r/4.1.3/include -fpic -g -O2 -c udunits.cpp -o udunits.o
/opt/shared/gcc/11.2.0/bin/g++ -std=gnu++14 -shared -L/opt/shared/r/4.1.3/lib64/R/lib -L/opt/shared/binutils/2.35/lib -L/opt/shared/gcc/11.2.0/lib -L/opt/shared/gcc/11.2.0/lib64 -I/opt/shared/r/4.1.3/lib64 -o units.so RcppExports.o udunits.o -lexpat -L/opt/shared/udunits/2.2.26/lib -lexpat -ludunits2 -L/opt/shared/r/4.1.3/lib64/R/lib -lR
installing to /work/it_css/sw/r/add-ons/r4.1.3/terra/00LOCK-units/00new/units/libs
** R
** demo
** inst
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** installing vignettes
** testing if installed package can be loaded from temporary location
** checking absolute paths in shared objects and dynamic libraries
** testing if installed package can be loaded from final location
** testing if installed package keeps a record of temporary installation path
* DONE (units)
> library(terra)
terra 1.7.55
> library(quantregForest)
Loading required package: randomForest
randomForest 4.7-1.1
Type rfNews() to see new features/changes/bug fixes.
Loading required package: RColorBrewer
> quit()
[(it_css:traine)@login01.caviness ~]$
=== Using IT's udbuild environment ===
IT developed a formalization for installing modules called [[abstract:caviness:install_software:install_software|udbuild]]
which can simplify the installation of modules. Here is an example ''udbuild''
script which can be used to install a personal R library.
#!/bin/bash -l
PKGNAME=testing
VERSION=default
UDBUILD_HOME=$WORKDIR/sw
PKG_LIST='
WideLM rpud permGPU magma gputools cudaBayesregData cudaBayesreg
CARramps
'
vpkg_devrequire udbuild r/3.1.1 r-cran/20140905
init_udbuildenv r-addon cuda/6.5
#Sometimes R doesn't properly use CPPFLAGS which is set by VALET, fix that here:
CPATH=$CUDA_PREFIX/include:$CPATH
LIBRARY_PATH=$CUDA_PREFIX/lib64:$CUDA_PREFIX/lib64/stubs:$LIBRARY_PATH
#CRAN_MIRROR='http://cran.cs.wwu.edu/'
CRAN_MIRROR='http://lib.stat.cmu.edu/R/CRAN/'
quote() { printf '"%s", ' "$@" | sed 's/, $/\n/'; }
R -q --no-save <
This script will attempt to build the CUDA-capable R modules against
CUDA 6.5 and install them into ''$WORKDIR/sw/r/add-ons/r3.1.1/testing/default-cuda-6.5''.
====== R script in batch ======
==== matmul.R script ====
Consider this simple R script, which multiplies a 3x4 matrix by its transpose to give a 3x3 result:
# Calculate and print small matrix AA'
a <- matrix(1:12,3,4);
a%*%t(a)
Let's test this R script using ''Rscript'' from the command line on a compute node. Before you use ''salloc'' to get on a compute node, don't forget to set your [[abstract:caviness:app_dev:compute_env#using-workgroup-and-directories|workgroup]] to select the compute nodes of your cluster group or //investing-entity//. For example,
workgroup -g it_css
salloc
vpkg_require r/3.5
Rscript matmul.R
The output to the screen:
[,1] [,2] [,3]
[1,] 166 188 210
[2,] 188 214 240
[3,] 210 240 270
To return to the head node, type
exit
==== matmul.qs file ====
Running an R script in batch rather than on the command line takes nearly the same steps. Copy a template job submission script (for example, ''/opt/shared/templates/slurm/generic/threads.qs'') and call it ''matmul.qs''. Then edit it to change the job name and add the commands for your job, something like this:
#!/bin/bash -l
#
....
#SBATCH --job-name=matmultiply_R
...
#
# [EDIT] Execute your OpenMP/threaded program using the srun command:
#
# Add vpkg_require commands
vpkg_require r/3.5
# Syntax: Rscript [options] filename.R [arguments]
Rscript matmul.R
Now to run the R script simply submit the job from the head node with the
''sbatch'' command.
sbatch matmul.qs
You should see a notification that your job was submitted, something like this:
Submitted batch job 983119
After the job completes, the output of the script will appear in the file
''slurm-983119.out'', because the job number is 983119. Type
more slurm-983119.out
to display the contents of the output file on the screen. For example,
-- OpenMP job setup complete:
-- OMP_THREAD_LIMIT = 2
-- OMP_PROC_BIND = true
-- OMP_PLACES = cores
-- MP_BLIST = 5,17
Adding package `r/3.5.1` to your environment
[,1] [,2] [,3]
[1,] 166 188 210
[2,] 188 214 240
[3,] 210 240 270
====== Using R script in batch array job ======
===== sweep.R file =====
Consider this simple script, which prints a fraction computed from its argument list:
args <- commandArgs(trailingOnly = TRUE)
# print fraction from argument list
as.numeric(args[1])/as.numeric(args[2])
This R script can be run from the command line on a compute node with the commands
salloc
vpkg_require r/3.5
Rscript sweep.R 5 200
The output to the screen:
[1] 0.025
===== sweep.qs file =====
Again copy a template job submission script (for example, ''/opt/shared/templates/slurm/generic/threads.qs'') and call it ''sweep.qs''. Edit it to change the job name, this time adding the option for an array job, and add the commands for your job, something like this:
#!/bin/bash -l
#
....
#SBATCH --job-name=sweep_R
#SBATCH --array=1-200
...
#
# [EDIT] Execute your OpenMP/threaded program using the srun command:
#
## Parameter sweep array job to run the sweep.R with
## lambda = 0, 1, 2, ..., 199
##
# Add vpkg_require commands
vpkg_require r/3.5
date "+Start %s"
echo "Host $HOSTNAME"
let lambda="$SLURM_ARRAY_TASK_ID-1"
let taskCount=200
# Syntax: Rscript [options] filename.R [arguments]
Rscript --vanilla sweep.R $lambda $taskCount
date "+Finish %s"
The ''date'' and ''echo Host'' lines are just a way of keeping track of when and where the tasks are run.
There will be 200 array tasks, all running the same script with different parameters (arguments). The ''--vanilla'' option
is used to prevent the tasks from reading or writing the same startup and workspace files.
To run this in batch you must submit the job from the head node with the
''sbatch'' command.
sbatch sweep.qs
You will see a notification that the job was submitted, like this:
Submitted batch job 1170728
After the tasks complete, the output of the script will appear in the files
''slurm-1170728_1.out'' to ''slurm-1170728_200.out''. The number ''1170728'' is the job ID assigned to your job when it was submitted, and 1 to 200 is the task ID (corresponding to ''--array=1-200'').
The array task that maps to our previous example using ''5 200'' is task 6, because lambda is the task ID minus 1. Its output file, ''slurm-1170728_6.out'', shows similar output:
-- OpenMP job setup complete:
-- OMP_THREAD_LIMIT = 2
-- OMP_PROC_BIND = true
-- OMP_PLACES = cores
-- MP_BLIST = 30,31
Adding package `r/3.5.1` to your environment
Start 1567531210
Host r00n15.localdomain.hpc.udel.edu
[1] 0.025
Finish 1567531210
You will want to do more than just print out one fraction in your script. The integer parameter can be used for
a one dimensional parameter sweep, to construct unique input and output file names for each task,
or as a seed for the R Random Number Generator (RNG).
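The uses of the task ID listed above can be sketched in shell. This is a minimal sketch under stated assumptions: the default task ID of 6 only lets it run outside Slurm, and the seed scheme (a fixed base plus the task ID) and the third ''Rscript'' argument are hypothetical, since ''sweep.R'' as shown reads only two arguments.

```shell
# Inside an array job Slurm sets SLURM_ARRAY_TASK_ID (1..200 for --array=1-200).
# The default of 6 is an assumption so the sketch also runs outside Slurm.
task_id="${SLURM_ARRAY_TASK_ID:-6}"
task_count=200

# Zero-based sweep parameter, computed the same way as in sweep.qs.
lambda=$(( task_id - 1 ))

# Hypothetical per-task RNG seed: a fixed base plus the task ID, so each
# task gets a distinct seed that is reproducible from run to run.
seed_base=20190903
seed=$(( seed_base + task_id ))

echo "task=$task_id lambda=$lambda of $task_count seed=$seed"

# In a job script the values would then be handed to R, for example:
#   Rscript --vanilla sweep.R "$lambda" "$task_count" "$seed"
# The third argument is an assumption; the R script would have to read it
# from commandArgs() and pass it to set.seed().
```

Deriving everything from the task ID means a single failed task can be rerun on its own and will use the same parameter and seed as before.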
==== Writing files from an array job ====
You are running many tasks in the same directory. Slurm handles the standard output by writing to
separate files with the task ID appended to the job ID (''slurm-JOBID_TASKID.out''). You need to take care of any other file output in your R script.
Make sure no two of your tasks will write to the same file. Look at your R script to see if you
are writing files. Look for the ''**sink**'' command or any graphics-writing commands such as ''**pdf**'' or ''**png**''.
If you are using these R functions, use a unique file name constructed from the task ID.
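One way to build such names is in the job script, before R starts. This is a minimal sketch: the file-name pattern, the variable names, and the script name ''myscript.R'' are hypothetical, and the default IDs are only there so the sketch runs outside Slurm.

```shell
# In an array job Slurm sets both variables; the defaults are assumptions
# taken from the example job on this page.
job_id="${SLURM_ARRAY_JOB_ID:-1170728}"
task_id="${SLURM_ARRAY_TASK_ID:-6}"

# Zero-padding the task ID keeps the files in order in directory listings.
tag=$(printf '%s_%03d' "$job_id" "$task_id")

plot_file="sweep_${tag}.pdf"   # hypothetical name, intended for pdf() in R
sink_file="sweep_${tag}.txt"   # hypothetical name, intended for sink() in R

echo "plot file: $plot_file"
echo "sink file: $sink_file"

# In a job script the names would be passed to the R script as arguments:
#   Rscript --vanilla myscript.R "$plot_file" "$sink_file"
# and read there with commandArgs(trailingOnly = TRUE).
```

With the default values the names are ''sweep_1170728_006.pdf'' and ''sweep_1170728_006.txt''. Because the task ID is part of every name, no two tasks of the same array job can collide.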
==== vanilla option ====
The command-line option ''--vanilla'' combines ''--no-save'', ''--no-restore'', ''--no-site-file'', ''--no-init-file'' and ''--no-environ''. This way your tasks will not
be reading or writing the same files. If you need initialization commands, put them in your R script instead of
in the init file ''.Rprofile''. If you need some environment variables, export them in your bash script instead of assigning
them in your environ file ''.Renviron''.
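Exporting a variable in the job script can be sketched as follows. This is a minimal sketch: the variable name ''SWEEP_DATA_DIR'' and its value are hypothetical, and a child shell stands in for the ''Rscript'' process so the sketch runs without R.

```shell
# Hypothetical variable name and location, used only for illustration.
export SWEEP_DATA_DIR="${WORKDIR:-$PWD}/sweep-data"

# Exported variables are inherited by child processes. An Rscript --vanilla
# process started from this script would see the value the same way
# (in R: Sys.getenv("SWEEP_DATA_DIR")), without reading .Renviron.
sh -c 'echo "child process sees SWEEP_DATA_DIR=$SWEEP_DATA_DIR"'
```

A variable assigned without ''export'' stays local to the job script and would not be visible to R.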