Differences
This shows you the differences between two versions of the page.
Next revision | Previous revision | ||
software:r:farber [2017-10-23 18:05] – created sraskar | software:r:farber [2021-03-17 14:44] (current) – [matmul.qs file] anita | ||
---|---|---|---|
Line 27: | Line 27: | ||
for use. | for use. | ||
- | ^r-cran | + | ^r-cran |
- | without any additional dependencies. | + | ^r-bioconductor |The full suite of [[http:// |
- | | + | ^r-fftw |
- | ^r-bioconductor |The full suite of \ | + | ^r-gsl |
- | [[http:// | + | ^r-gdal |
- | ^r-fftw | + | ^r-jags |
- | ^r-gsl | + | ^r-mpi |
- | GLPK(GNU Linear Programming Kit), or MPFR(GNU MPFR Library) | + | ^r-netcdf |
- | ^r-gdal | + | ^r-all |
- | Library) and GEOS(Geometry Engine, Open Source) | + | ^r-cuda |
- | ^r-jags | + | |
- | the r-gsl library mentioned above. | + | |
- | ^r-mpi | + | |
- | computing. | + | |
- | ^r-netcdf | + | |
- | libraries. | + | |
- | ^r-all | + | |
- | and CRAN module with multiple dependencies from the above \ | + | |
- | | + | |
- | ^r-cuda | + | |
=== Searching for modules === | === Searching for modules === | ||
Line 255: | Line 245: | ||
=== Using IT's udbuild environment === | === Using IT's udbuild environment === | ||
- | IT developed a formalization for installing modules called [[farber:udbuild]] | + | IT developed a formalization for installing modules called [[/abstract/farber/ |
which can simplify the installation of modules. | which can simplify the installation of modules. | ||
script which can be used to install a personal R library. | script which can be used to install a personal R library. | ||
Line 298: | Line 288: | ||
cuda 6.5 version into '' | cuda 6.5 version into '' | ||
+ | ====== R script in batch ====== | ||
+ | |||
+ | ==== matmul.R script ==== | ||
+ | |||
+ | Consider a simple R script that multiplies a small matrix by its transpose, giving a 3x3 result: | ||
+ | |||
+ | <file R matmul.R> | ||
+ | # Calculate and print small matrix AA' | ||
+ | a <- matrix(1: | ||
+ | a%*%t(a) | ||
+ | </ | ||
+ | |||
+ | Let's test this R script using '' | ||
+ | |||
+ | <code bash> | ||
+ | workgroup -g it_css | ||
+ | qlogin | ||
+ | vpkg_require r/3 | ||
+ | Rscript matmul.R | ||
+ | </ | ||
+ | |||
+ | The output to the screen: | ||
+ | |||
+ | < | ||
+ | [,1] [,2] [,3] | ||
+ | [1,] 166 188 210 | ||
+ | [2,] 188 214 240 | ||
+ | [3,] 210 240 270 | ||
+ | </ | ||
+ | |||
+ | To return to the head node, type | ||
+ | <code bash> | ||
+ | exit | ||
+ | </ | ||
+ | |||
+ | ==== matmul.qs file ==== | ||
+ | |||
+ | Running an R script in batch instead of on the command line takes nearly the same steps. | ||
+ | Consider the queue submission script file: | ||
+ | |||
+ | <file bash matmul.qs> | ||
+ | #$ -N matmultiply | ||
+ | |||
+ | # Add vpkg_require commands after this line: | ||
+ | vpkg_require r/3 | ||
+ | |||
+ | # Syntax: Rscript [options] filename.R [arguments] | ||
+ | Rscript matmul.R | ||
+ | </ | ||
+ | |||
+ | Now, to run the R script, simply submit the job from the head node with the | ||
+ | '' | ||
+ | |||
+ | < | ||
+ | qsub matmul.qs | ||
+ | </ | ||
+ | |||
+ | You should see a notification that your job was submitted. | ||
+ | |||
+ | <code bash> | ||
+ | Your job 2283886 (" | ||
+ | </ | ||
+ | |||
+ | After the job completes, the output of the script will appear in the file | ||
+ | '' | ||
+ | |||
+ | < | ||
+ | more matmultiply.o2283886 | ||
+ | </ | ||
+ | |||
+ | to display the contents of the output file on the screen. | ||
+ | |||
+ | < | ||
+ | Adding dependency `x11/ | ||
+ | Adding package `r/3.0.2` to your environment | ||
+ | [,1] [,2] [,3] | ||
+ | [1,] 166 188 210 | ||
+ | [2,] 188 214 240 | ||
+ | [3,] 210 240 270 | ||
+ | </ | ||
+ | |||
+ | ====== Using R script in batch array job ====== | ||
+ | ===== sweep.R file ===== | ||
+ | |||
+ | Consider a simple script that prints a fraction computed from its argument list: | ||
+ | |||
+ | <file R sweep.R> | ||
+ | args <- commandArgs(trailingOnly = TRUE) | ||
+ | # print fraction from argument list | ||
+ | as.numeric(args[1])/ | ||
+ | </ | ||
+ | |||
+ | This is an R script which can be run from the command line on a compute node with the commands | ||
+ | |||
+ | <code bash> | ||
+ | qlogin | ||
+ | vpkg_require r/3 | ||
+ | Rscript sweep.R 5 200 | ||
+ | </ | ||
+ | |||
+ | The output to the screen: | ||
+ | < | ||
+ | [1] 0.025 | ||
+ | </ | ||
+ | |||
+ | ===== sweep.qs file ===== | ||
+ | |||
+ | Consider the queue script file | ||
+ | |||
+ | <file bash sweep.qs> | ||
+ | #$ -N sweep | ||
+ | #$ -t 1-200 | ||
+ | ## | ||
+ | ## Parameter sweep array job to run the sweep.R | ||
+ | ## lambda = 0, 1, 2, ..., 199 | ||
+ | ## | ||
+ | |||
+ | # Add vpkg_require commands after this line: | ||
+ | vpkg_require r/3 | ||
+ | |||
+ | date " | ||
+ | echo "Host $HOSTNAME" | ||
+ | |||
+ | let lambda=" | ||
+ | let taskCount=200 | ||
+ | |||
+ | # Syntax: Rscript [options] filename.R [arguments] | ||
+ | Rscript --vanilla sweep.R $lambda $taskCount | ||
+ | |||
+ | date " | ||
+ | </ | ||
+ | |||
+ | The '' | ||
+ | There will be 200 array jobs all running the same script with different parameters (arguments). | ||
+ | is used to prevent the multiple jobs from using the same disk space. | ||
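The relationship between the 200 array tasks and the sweep parameter can be sketched as follows. This is only a rough illustration, assuming Grid Engine sets ''SGE_TASK_ID'' to 1 through 200 for the tasks requested with ''-t 1-200'' and that lambda is meant to run from 0 to 199 as the script comments state. The helper name ''task_to_lambda'' and the fallback value 6 are hypothetical and are not part of the queue script above.

```shell
#!/bin/bash
# Hypothetical helper: map a 1-based array task ID to a 0-based lambda.
task_to_lambda() {
    local task_id="$1"
    echo $(( task_id - 1 ))
}

# SGE_TASK_ID is set by Grid Engine for each array task. The fallback
# value 6 is arbitrary and only lets the sketch run outside the scheduler.
task_id="${SGE_TASK_ID:-6}"
lambda="$(task_to_lambda "$task_id")"
taskCount=200

echo "task $task_id -> lambda $lambda of $taskCount"
```

With this mapping each task gets a different first argument, while the second argument (the task count) is the same for every task.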
+ | |||
+ | To run this in batch you must submit the job from the head node with the | ||
+ | '' | ||
+ | |||
+ | < | ||
+ | qsub sweep.qs | ||
+ | </ | ||
+ | |||
+ | After the jobs complete, the output of the script will appear in the files | ||
+ | '' | ||
+ | |||
+ | < | ||
+ | Adding dependency `x11/ | ||
+ | Adding package `r/3.0.2` to your environment | ||
+ | [1] 0.025 | ||
+ | </ | ||
+ | <note tip> | ||
+ | You will want to do more than just print out one fraction in your script. | ||
+ | a one-dimensional parameter sweep, to construct unique input and output file names for each task, | ||
+ | or as a seed for the R Random Number Generator (RNG).</ | ||
+ | |||
+ | ==== Writing files from an array job ==== | ||
+ | |||
+ | You are running many jobs in the same directory. | ||
+ | separate files with "dot taskid" | ||
+ | |||
+ | <note important> | ||
+ | You need to make sure no two of your jobs will write to the same file. Look at your R script to see if you | ||
+ | are writing files. | ||
+ | If you are using these R functions, then use a unique file name constructed from the task id. | ||
+ | </ | ||
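One way to build a unique file name from the task ID inside the queue script is sketched below. It assumes ''SGE_TASK_ID'' is set by Grid Engine for each array task. The helper ''task_file'', the ''result'' prefix, and the ''csv'' extension are hypothetical names chosen for illustration.

```shell
#!/bin/bash
# Hypothetical helper: build a per-task file name such as "result.17.csv".
task_file() {
    local prefix="$1" task_id="$2" ext="$3"
    echo "${prefix}.${task_id}.${ext}"
}

# SGE_TASK_ID is set by Grid Engine for each array task. The fallback
# value 1 is arbitrary and only lets the sketch run outside the scheduler.
task_id="${SGE_TASK_ID:-1}"
outfile="$(task_file result "$task_id" csv)"

echo "task $task_id would write to $outfile"
```

The resulting name could then be passed to the R script as an extra command-line argument, in which case the R script would need to read that argument and use it when writing its output.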
+ | |||
+ | ==== vanilla option ==== | ||
+ | |||
+ | The command-line option '' | ||
+ | be reading or writing to the same files. | ||
+ | in the init-file '' | ||
+ | them in your environ file '' | ||