====== Scheduling Jobs on Farber ======

In order to schedule any job (interactively or batch) on a cluster, you must set your **[[/abstract/farber/app_dev/compute_env#using-workgroup-and-directories|workgroup]]** to define your cluster group or //investing-entity// compute nodes.

===== Interactive jobs (qlogin) =====

As discussed, an //interactive job// allows a user to enter a sequence of commands manually. The following qualify as interactive jobs:
Use the login (head) node for interactive program development, including Fortran, C, and C++ program compilation. Use Grid Engine (**qlogin**) to start interactive shells on your workgroup //investing-entity// compute nodes.

==== Submitting an Interactive Job ====

In Grid Engine, interactive jobs are submitted to the job scheduler using the ''qlogin'' command:

===== Batch Jobs (qsub) =====
Prerequisite to the submission of //batch jobs// to the job scheduler is the writing of a //job script//. Grid Engine job scripts follow the same form as shell scripts, with a few exceptions:
</file>

==== Submitting a Batch Job ====

Grid Engine provides the **qsub** command for scheduling batch jobs:

^ command ^ Action ^
| ''qsub'' <<//command_line_options//>> <<//job_script//>> | Submit a job using the commands in the script file <<//job_script//>> |

For example,

<code bash>
where the argument to the ''-a'' option is in the form ''YYYYMMDDHHmm'' (year, month, day, hour, minute).
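For instance, a shell snippet can build a valid ''-a'' timestamp from a human-readable date; this is a sketch assuming GNU ''date'' (as on the cluster's Linux nodes) and a hypothetical start time:

```bash
# Build a YYYYMMDDHHmm timestamp for qsub's -a option.
# GNU date's -d flag parses a human-readable date string.
at_time=$(date -d "2030-01-02 03:04" +%Y%m%d%H%M)
echo "$at_time"    # 203001020304
# A delayed submission would then look like (not run here):
#   qsub -a "$at_time" myproject.qs
```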

==== Job Output ====

Equally as important as executing the job is capturing any output produced by the job. As mentioned above, the ''-j y'' option sends all output (stdout and stderr) to a single file. By default, that output file is named according to the formula
<note>If the user overrides the default joining of regular and error output to a single file (using ''-j n''), the error output is directed to a file named as described above but with a ''.e[job id]'' suffix. Likewise, an explicit filename can be provided using the ''-e'' option.</note>

==== Forgetting the Filename ====

A user may mistakenly omit the script filename from the ''qsub'' command. Surprisingly, ''qsub'' does not complain in such a situation; instead, it pauses and allows the user to type a script:
The "''^D''" represents holding down the "control" key and pressing the "D" key; this signals "end of file" and lets ''qsub'' know that the user is done entering lines of text. By default, a batch job submitted in this fashion will be named "''STDIN''".

===== More details about using qsub =====

For example,

  qsub myproject.qs

or to submit a standby job that waits for idle nodes (up to 240 slots for 8 hours),

  qsub -l standby=1 myproject.qs

or to submit a standby job that waits for idle 48-core nodes (if you are using a cluster with 48-core nodes like farber),

  qsub -l standby=1 -q standby.q@@48core myproject.qs

or to submit a standby job that waits for idle 24-core nodes (it would not be assigned to any 48-core nodes; important for consistency of core assignment),

  qsub -l standby=1 -q standby.q@@24core myproject.qs

or to submit to the four-hour standby queue (up to 816 slots spanning all nodes),

  qsub -l standby=1,h_rt=4:00:00 myproject.qs

or to submit to the four-hour standby queue spanning just the 24-core nodes,

  qsub -l standby=1,h_rt=4:00:00 -q standby-4h.q@@24core myproject.qs

The file ''myproject.qs'' will contain bash shell commands and **qsub** statements that include **qsub** options and resource specifications. The **qsub** statements begin with ''#$''.
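For reference, a minimal ''myproject.qs'' might look like the sketch below. The job name and resource values are illustrative only; the ''#$'' lines are read by **qsub** as options, while bash treats them as ordinary comments:

```bash
#!/bin/bash
#
# Lines beginning with #$ are qsub option statements;
# to bash they are ordinary comments.
#$ -N myproject
#$ -l standby=1,h_rt=4:00:00

# Ordinary shell commands follow; replace them with your real workload.
host=$(hostname)
echo "Job starting on host: ${host}"
echo "Job finished."
```

Pattern your real script after the prototypes described below rather than writing one from scratch.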

<note tip>
We strongly recommend that you pattern your script file after the prototypes in **/opt/shared/templates** and save your job script files within a **$WORKDIR** (private work) directory.

Reusable job scripts help you maintain a consistent batch environment across runs. The optional **.qs** filename suffix signifies a **q**ueue-**s**ubmission script file.
</note>

<note important>See also [[:abstract:farber:runjobs:schedule_jobs#resource-management-options-on-farber|resource options]] to specify memory free and/or available, [[abstract:farber:runjobs:queues#farber-exclusive-access|exclusive]] access, and requesting specific [[:software:matlab:matlab#license-information|Matlab licenses]].</note>

==== Grid Engine environment variables ====

In every batch session, Grid Engine sets environment variables that are useful within job scripts. Here are some common examples; the rest appear in the ENVIRONMENTAL VARIABLES section of the **qsub** man page.

^ Environment variable ^ Contains ^
| **HOSTNAME** | Name of the execution (compute) node |
| **JOB_ID** | Batch job id assigned by Grid Engine |
| **JOB_NAME** | Name you assigned to the batch job (see [[#more-options-for-qsub|More options for qsub]]) |
| **NSLOTS** | Number of //scheduling slots// (processor cores) assigned by Grid Engine to this job |
| **SGE_TASK_ID** | Task id of an array job sub-task (see [[#array-jobs|Array jobs]]) |
| **TMPDIR** | Name of a directory on the (compute) node's scratch filesystem |

When Grid Engine assigns one of your job's tasks to a particular node, it creates a temporary work directory on that node's 1-2 TB local scratch disk. When the task assigned to that node is finished, Grid Engine removes the directory and its contents. The form of the directory name is

**/scratch/[$JOB_ID].[$SGE_TASK_ID].<<//queue_name//>>**

For example, after ''qlogin'' type
<code bash>
echo $TMPDIR
</code>
to see the name of the node scratch directory for this interactive job.
<file>
/scratch/71842.1.it_css-qrsh.q
</file>

See [[:clusters:farber:filesystems|Filesystems]] and [[:abstract:farber:app_dev:compute_env|Computing environment]] for more information about the node scratch filesystem and using environment variables.

Grid Engine uses these environment variables' values when creating the job's output files:

^ File name pattern ^ Description ^
| [$JOB_NAME].o[$JOB_ID] | Default **output** filename |
| [$JOB_NAME].e[$JOB_ID] | **Error** filename (when not joined to output) |
| [$JOB_NAME].po[$JOB_ID] | Parallel job **output** filename (empty for most queues) |
| [$JOB_NAME].pe[$JOB_ID] | Parallel job **error** filename (usually empty) |
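As a sketch of the first pattern, the default output filename can be reconstructed from these variables; the values below are hypothetical stand-ins for what Grid Engine would set in a real job:

```bash
# Hypothetical values; in a real job Grid Engine sets these for you.
JOB_NAME="myproject"
JOB_ID="71842"

# Default stdout file follows the [$JOB_NAME].o[$JOB_ID] pattern.
outfile="${JOB_NAME}.o${JOB_ID}"
echo "$outfile"    # myproject.o71842
```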

==== More options for qsub ====

The most commonly used **qsub** options fall into two categories: //operational// and //resource-management//. The operational options deal with naming the output files, mail notification of the processing steps, sequencing of a series of jobs, and establishing the UNIX environment. The resource-management options deal with the specific system resources you desire or need, such as parallel programming environments, number of processor cores, maximum CPU time, and virtual memory needed.

The table below lists **qsub**'s common operational options.

^ Option / Argument ^ Function ^
| ''-N'' <<//job_name//>> | Names the job <//job_name//>. Default: the job script's full filename. |
| ''-m'' {b%%|%%e%%|%%a%%|%%s%%|%%n} | Specifies when e-mail notifications of the job's status should be sent: **b**eginning, **e**nd, **a**bort, **s**uspend. Default: **n**ever |
| ''-M'' <<//email_address//>> | Specifies the email address to use for notifications. |
| ''-j'' {y%%|%%n} | Joins (redirects) the //STDERR// results to //STDOUT//. Default: **y** (//yes//) |
| ''-o'' <<//output_file//>> | Directs job output //STDOUT// to <//output_file//>. Default: see [[#grid-engine-environment-variables|Grid Engine environment variables]] |
| ''-e'' <<//error_file//>> | Directs job errors (//STDERR//) to <//error_file//>. The file is only produced when the **qsub** option ''-j n'' is used. |
| ''-hold_jid'' <//job_list//> | Holds the job until the jobs named in <//job_list//> are completed. <//job_list//> may be a comma-separated list of numeric job ids or job names. |
| ''-t'' <<//task_id_range//>> | Used for //array// jobs. See [[#array-jobs|Array jobs]] for details. |
^ Special notes for IT clusters: ^^
| ''-cwd'' | Default. Uses the current directory as the job's working directory. |
| ''-V'' | Ignored. Generally, the login node's environment is not appropriate to pass to a compute node. Instead, you must define the environment variables directly in the job script. |
| ''-q'' <<//queue_name//>> | Not needed in most cases. Your choice of resource-management options determines the queue. |
^ The resource-management options for ''qsub'' have two common forms: ^^
| ''-l'' <<//resource//>>''=''<<//value//>> ||
| ''-pe'' <<//parallel_environment//>> <<//Nproc//>> ||

For example, putting the lines
<file>
#$ -l h_cpu=1:30:00
#$ -pe threads 12
</file>
in the job script tells Grid Engine to set a hard limit of 1.5 hours on the CPU time resource for the job, and to assign 12 processors to your job.

Grid Engine tries to satisfy all of the resource-management options you specify in a job script or as **qsub** command-line options. If there is a queue already defined that accepts jobs having that particular combination of requests, Grid Engine assigns your job to that queue.

===== Resource-management options on Farber =====
For memory, you will be concerned about how much is free. Memory resources come as both consumable and sensor driven (not consumable). For example:
^ memory resource ^ Consumable ^ Explanation ^
| m_mem_free | Yes | Memory consumed per CPU DURING execution |

''m_mem_free'' is consumable, which means you are reserving the memory for future use: other jobs using ''m_mem_free'' may be barred from starting on the node. If you specify memory resources for a parallel-environment job, the requested memory is multiplied by the slot count. If not specified, ''m_mem_free'' defaults to 1GB of memory per core (slot).
<note tip>When using the shared-memory parallel environment ''-pe threads'', divide the total memory needed by the number of slots. For example, to request 48G of shared memory for an 8-thread job, request 6G per slot, i.e., ''-l m_mem_free=6G''.</note>
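The per-slot arithmetic can be scripted; a small sketch using the example numbers above (48G total across 8 slots):

```bash
# Per-slot memory request = total shared memory / slot count.
total_gb=48   # total shared memory the job needs, in GB
slots=8       # threads (slots) requested with -pe threads
per_slot_gb=$(( total_gb / slots ))
echo "-l m_mem_free=${per_slot_gb}G"    # -l m_mem_free=6G
```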

<note warning>Please note a job error will occur and prevent the queue from accepting jobs when:

<code>
qsub -l m_mem_free=20G -t 1-30 myjob.qs
</code>

This will submit 30 jobs to the queue, with the SGE_TASK_ID variable set for use in the ''myjob.qs'' script (an [[:abstract:farber:runjobs:schedule_jobs#array-jobs|array job]]).
The ''m_mem_free'' resource tells Grid Engine not to schedule a job on a node unless the specified amount of memory (here, 20GB per CPU) is available to consume on that node. Since each task is a serial job running on a single CPU, 20GB is the total memory available to the task.
==== Parallel environments ====

The ''/opt/shared/templates/gridengine'' directory contains basic prototype job scripts for non-interactive parallel jobs. This section describes the **-pe** parallel environment option that's required for MPI jobs, OpenMP jobs and other jobs that use the SMP (threads) programming model.

Type the command:
=== The threads parallel environment ===

Jobs such as those having OpenMP directives use the **//threads//** parallel environment, an implementation of the [[:abstract:farber:app_dev:prog_env#programming-models|shared-memory programming model]]. These SMP jobs can only use the cores on a **single** node.

For example, if your group only owns nodes with 24 cores, then your ''-pe threads'' request may only ask for 24 or fewer slots. Use Grid Engine's **qconf** command to determine the names and characteristics of the queues and compute nodes available to your investing-entity group on a cluster.

<note tip>
IT provides a job script template called ''openmp.qs'' available in ''/opt/shared/templates/gridengine/openmp'' to copy and customize for your OpenMP jobs.
</note>
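Putting these pieces together, a ''-pe threads'' job script might look like the sketch below; the program name is hypothetical, and a real script should be patterned after the ''openmp.qs'' template:

```bash
#!/bin/bash
#$ -N openmp-demo
#$ -pe threads 8
#$ -l m_mem_free=2G

# Grid Engine sets NSLOTS to the slot count requested with -pe;
# hand it to the OpenMP runtime so the thread count matches the slots.
export OMP_NUM_THREADS=${NSLOTS:-1}
echo "Running with ${OMP_NUM_THREADS} OpenMP threads"
# ./my_openmp_program   # hypothetical OpenMP executable
```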

MPI jobs inherently generate considerable network traffic among the processor cores of a cluster's compute nodes. The processors on the compute nodes may be connected by two types of networks: InfiniBand and Gigabit Ethernet.

IT has developed templates to help with the **openmpi** parallel environments for Farber, targeting different user needs and architectures. You can copy the templates from ''/opt/shared/templates/gridengine/openmpi'' and customize them. These templates are essentially identical with the exception of the presence or absence of certain **qsub** options and the values assigned to **MPI_FLAGS** based on using particular environment variables. In all cases, the parallel environment option must be specified:

''-pe mpi'' <<//NPROC//>>

<note tip>
IT provides several job script templates in ''/opt/shared/templates/gridengine/openmpi'' to copy and customize for your Open MPI jobs. See [[software:openmpi:farber|Open MPI on Farber]] for more details about these job scripts.
</note>

==== Job Templates ====

Detailed information pertaining to individual kinds of parallel jobs -- like setting the ''OMP_NUM_THREADS'' environment variable to ''$NSLOTS'' for OpenMP programs -- is provided by UD IT in a collection of job template scripts on a per-cluster basis under the ''/opt/shared/templates'' directory. For example, on farber this directory looks like:

<code bash>
[(it_css:traine)@farber ~]$ ls -l /opt/shared/templates
total 4
drwxr-sr-x 7 frey _sgeadm 104 Jul 17 08:11 dev-projects
==== Array jobs ====

An array job essentially runs the same job many times by generating a new repeated task each time. Each time, the environment variable **SGE_TASK_ID** is set to a sequence number by Grid Engine, and its value provides input to the job submission script.
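As a sketch, a job script can use **SGE_TASK_ID** to select a per-task input file; the ''input-N.txt'' naming here is hypothetical:

```bash
#!/bin/bash
#$ -N array-demo
#$ -t 1-30

# Grid Engine sets SGE_TASK_ID to 1, 2, ..., 30, one value per task;
# use it to pick this task's input file (hypothetical naming scheme).
task="${SGE_TASK_ID:-1}"
infile="input-${task}.txt"
echo "Task ${task} would process ${infile}"
```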

<note tip>