Differences
This shows you the differences between two versions of the page.
software:r:farber [2018-04-26 13:23] – [personal/program specific R libraries and extensions] sraskar | software:r:farber [2021-03-17 14:44] (current) – [matmul.qs file] anita | ||
---|---|---|---|
Line 288: | Line 288: | ||
cuda 6.5 version into '' | cuda 6.5 version into '' | ||
+ | ====== R script in batch ====== | ||
+ | |||
+ | ==== matmul.R script ==== | ||
+ | |||
+ | Consider a simple R script that forms the matrix product AA' and prints the 3x3 result: | ||
+ | |||
+ | <file R matmul.R> | ||
+ | # Calculate and print small matrix AA' | ||
+ | a <- matrix(1: | ||
+ | a%*%t(a) | ||
+ | </ | ||
+ | |||
+ | Let's test this R script using '' | ||
+ | |||
+ | <code bash> | ||
+ | workgroup -g it_css | ||
+ | qlogin | ||
+ | vpkg_require r/3 | ||
+ | Rscript matmul.R | ||
+ | </ | ||
+ | |||
+ | The output to the screen: | ||
+ | |||
+ | < | ||
+ | [,1] [,2] [,3] | ||
+ | [1,] 166 188 210 | ||
+ | [2,] 188 214 240 | ||
+ | [3,] 210 240 270 | ||
+ | </ | ||
+ | |||
+ | To return to the head node, type | ||
+ | <code bash> | ||
+ | exit | ||
+ | </ | ||
+ | |||
+ | ==== matmul.qs file ==== | ||
+ | |||
+ | Running an R script in batch, instead of on the command line, takes nearly the same steps. | ||
+ | Consider the queue submission script file: | ||
+ | |||
+ | <file bash matmul.qs> | ||
+ | #$ -N matmultiply | ||
+ | |||
+ | # Add vpkg_require commands after this line: | ||
+ | vpkg_require r/3 | ||
+ | |||
+ | # Syntax: Rscript [options] filename.R [arguments] | ||
+ | Rscript matmul.R | ||
+ | </ | ||
+ | |||
+ | Now, to run the R script, submit the job from the head node with the | ||
+ | '' | ||
+ | |||
+ | < | ||
+ | qsub matmul.qs | ||
+ | </ | ||
+ | |||
+ | You should see a notification that your job was submitted. | ||
+ | |||
+ | <code bash> | ||
+ | Your job 2283886 (" | ||
+ | </ | ||
+ | |||
+ | After the job completes, the output of the script will appear in the file | ||
+ | '' | ||
+ | |||
+ | < | ||
+ | more matmultiply.o2283886 | ||
+ | </ | ||
+ | |||
+ | to display the contents of the output file on the screen. | ||
+ | |||
+ | < | ||
+ | Adding dependency `x11/ | ||
+ | Adding package `r/3.0.2` to your environment | ||
+ | [,1] [,2] [,3] | ||
+ | [1,] 166 188 210 | ||
+ | [2,] 188 214 240 | ||
+ | [3,] 210 240 270 | ||
+ | </ | ||
+ | |||
+ | ====== Using R script in batch array job ====== | ||
+ | ===== sweep.R file ===== | ||
+ | |||
+ | Consider a simple script that prints a fraction computed from its argument list: | ||
+ | |||
+ | <file R sweep.R> | ||
+ | args <- commandArgs(trailingOnly = TRUE) | ||
+ | # print fraction from argument list | ||
+ | as.numeric(args[1])/ | ||
+ | </ | ||
+ | |||
+ | This R script can be run from the command line on a compute node with the commands | ||
+ | |||
+ | <code bash> | ||
+ | qlogin | ||
+ | vpkg_require r/3 | ||
+ | Rscript sweep.R 5 200 | ||
+ | </ | ||
+ | |||
+ | The output to the screen: | ||
+ | < | ||
+ | [1] 0.025 | ||
+ | </ | ||
+ | |||
+ | ===== sweep.qs file ===== | ||
+ | |||
+ | Consider the queue script file | ||
+ | |||
+ | <file bash sweep.qs> | ||
+ | #$ -N sweep | ||
+ | #$ -t 1-200 | ||
+ | ## | ||
+ | ## Parameter sweep array job to run the sweep.R | ||
+ | ## lambda = 0, 1, 2, ..., 199 | ||
+ | ## | ||
+ | |||
+ | # Add vpkg_require commands after this line: | ||
+ | vpkg_require r/3 | ||
+ | |||
+ | date " | ||
+ | echo "Host $HOSTNAME" | ||
+ | |||
+ | let lambda=" | ||
+ | let taskCount=200 | ||
+ | |||
+ | # Syntax: Rscript [options] filename.R [arguments] | ||
+ | Rscript --vanilla sweep.R $lambda $taskCount | ||
+ | |||
+ | date " | ||
+ | </ | ||
+ | |||
+ | The '' | ||
+ | There will be 200 array jobs all running the same script with different parameters (arguments). | ||
+ | is used to prevent the multiple jobs from using the same disk space. | ||
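The mapping from array task to script arguments can be sketched as follows. This is a minimal illustration, not the site's own script: it assumes Grid Engine exports ''SGE_TASK_ID'' for each task of a ''-t 1-200'' array job and that lambda is the task id minus one, which is consistent with the ''lambda = 0 ... 199'' comment in sweep.qs. The helper name ''lambda_for_task'' is hypothetical, and the commands are only printed, not run.

```shell
#!/bin/bash
# Hypothetical helper: map a 1-based array task id to a 0-based lambda.
# Assumption: lambda = task id - 1, matching "lambda = 0 ... 199" for tasks 1-200.
lambda_for_task() {
    echo $(( $1 - 1 ))
}

taskCount=200

# Print (do not execute) the command a few sample tasks would run.
for task_id in 1 6 200; do
    lambda=$(lambda_for_task "$task_id")
    echo "task $task_id: Rscript --vanilla sweep.R $lambda $taskCount"
done
```

Inside a real array job the loop would not be needed: each task would call the helper once with its own ''$SGE_TASK_ID''.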
+ | |||
+ | To run this in batch you must submit the job from the head node with the | ||
+ | '' | ||
+ | |||
+ | < | ||
+ | qsub sweep.qs | ||
+ | </ | ||
+ | |||
+ | After the job completes, the output of the script will appear in the files | ||
+ | '' | ||
+ | |||
+ | < | ||
+ | Adding dependency `x11/ | ||
+ | Adding package `r/3.0.2` to your environment | ||
+ | [1] 0.025 | ||
+ | </ | ||
+ | <note tip> | ||
+ | You will want to do more than just print out one fraction in your script. | ||
+ | a one dimensional parameter sweep, to construct unique input and output file names for each task, | ||
+ | or as a seed for the R Random Number Generator (RNG).</ | ||
+ | |||
+ | ==== Writing files from an array job ==== | ||
+ | |||
+ | You are running many jobs in the same directory. | ||
+ | separate files with "dot taskid" | ||
+ | |||
+ | <note important> | ||
+ | You need to make sure no two of your jobs will write to the same file. Look at your R script to see if you | ||
+ | are writing files. | ||
+ | If you are using these R functions, then use a unique file name constructed from the task id. | ||
+ | </ | ||
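One way to keep tasks from overwriting each other's files is to build every file name from the task id in the queue script and hand it to the R script as an extra argument. The sketch below is an assumption-laden illustration: the helper ''task_file'' and the base name ''sweep_result'' are hypothetical, and sweep.R as shown above does not read a third argument, so the R side would need to be extended to use it.

```shell
#!/bin/bash
# Hypothetical helper: build a "dot taskid" file name, e.g. sweep_result.7
task_file() {
    printf '%s.%s\n' "$1" "$2"
}

# Assumption: Grid Engine sets SGE_TASK_ID inside an array job;
# fall back to 1 so the sketch also runs outside the scheduler.
task_id="${SGE_TASK_ID:-1}"
outfile="$(task_file sweep_result "$task_id")"

# The name could then be passed on, for example as an additional
# argument to Rscript (hypothetical; sweep.R would have to read it).
echo "Task $task_id would write to $outfile"
```

Because the task id is unique within an array job, two tasks never produce the same file name as long as they share the same base name.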
+ | |||
+ | ==== vanilla option ==== | ||
+ | |||
+ | The command-line option '' | ||
+ | be reading or writing to the same files. | ||
+ | in the init-file '' | ||
+ | them in your environ file '' | ||