====== Scheduling Jobs on Farber ======

In order to schedule any job (interactively or batch) on a cluster, you must set your **[[/abstract/farber/app_dev/compute_env#using-workgroup-and-directories|workgroup]]** to define your cluster group or //investing-entity// compute nodes.
where the argument to the ''-a'' option is in the form ''YYYYMMDDHHmm'' (year, month, day, hour, minute).
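For instance, such a date-time string can be built with GNU ''date'' and handed to ''qsub''; this is only a sketch, and the script name ''myjob.qs'' is a placeholder:

```shell
# Build a Grid Engine YYYYMMDDHHmm string for "tomorrow at 01:30"
# using GNU date's -d option, then pass it to qsub's -a option.
START=$(date -d 'tomorrow 01:30' +%Y%m%d%H%M)
echo "$START"
# qsub -a "$START" myjob.qs   # hypothetical submission; requires Grid Engine
```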

==== Job Output ====

Equally as important as executing the job is capturing any output produced by the job. As mentioned above, the ''-j y'' option sends all output (stdout and stderr) to a single file. By default, that output file is named according to the formula
<note>If the user overrides the default joining of regular and error output to a single file (using ''-y n''), the error output is directed to a file named as described above but with a ''.e[job id]'' suffix. Likewise, an explicit filename can be provided using the ''-e'' option.</note>
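As a sketch of this naming convention, the default output and error filenames can be composed as follows (the job name and job id here are made up; Grid Engine substitutes the real values):

```shell
# Illustrate Grid Engine's output-file naming; values are hypothetical.
JOB_NAME=myjob
JOB_ID=12345
echo "${JOB_NAME}.o${JOB_ID}"   # combined output file when -j y is used
echo "${JOB_NAME}.e${JOB_ID}"   # separate error file when -j n is used
```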

==== Forgetting the Filename ====

A user may mistakenly omit the script filename from the ''qsub'' command. Surprisingly, ''qsub'' does not complain in such a situation; instead, it pauses and allows the user to type a script:

<note tip>
We strongly recommend that you use a script file that you pattern after the prototypes in **/opt/shared/templates** and save your job script files within a **$WORKDIR** (private work) directory.

Reusable job scripts help you maintain a consistent batch environment across runs. The optional **.qs** filename suffix signifies a **q**ueue-**s**ubmission script file.
</note>

| [$JOB_NAME].pe[$JOB_ID] | Parallel job **error** filename (Usually empty) |

==== More options for qsub ====

The most commonly used **qsub** options fall into two categories: //operational// and //resource-management//. The operational options deal with naming the output files, mail notification of the processing steps, sequencing of a series of jobs, and establishing the UNIX environment. The resource-management options deal with the specific system resources you desire or need, such as parallel programming environments, number of processor cores, maximum CPU time, and virtual memory needed.

For memory, you will be concerned with how much is free. Memory resources come as both consumable and sensor-driven (not consumable). For example:
^ memory resource ^ Consumable ^ Explanation ^
| m_mem_free | Yes | Memory consumed per CPU DURING execution |

''m_mem_free'' is consumable, which means you are reserving the memory for future use. Other jobs using ''m_mem_free'' may be barred from starting on the node. If you are specifying memory resources for a parallel environment job, the requested memory is multiplied by the slot count. If not specified, ''m_mem_free'' defaults to 1GB of memory per core (slot).

<note tip>When using the shared memory parallel computing environment ''-pe threads'', divide the total memory needed by the number of slots. For example, to request 48G of shared memory for an 8-thread job, request 6G per slot, i.e., ''-l m_mem_free=6G''.</note>
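The arithmetic in the tip above can be sketched in the shell; the 48G total and 8 slots mirror the example, and ''myjob.qs'' is a placeholder script name:

```shell
# Divide total shared memory by the slot count to get the per-slot request.
TOTAL_GB=48
NSLOTS=8
PER_SLOT_GB=$(( TOTAL_GB / NSLOTS ))
echo "-l m_mem_free=${PER_SLOT_GB}G"
# qsub -pe threads 8 -l m_mem_free=6G myjob.qs   # hypothetical submission
```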

<note warning>Please note a job error will occur and prevent the queue from accepting jobs when:

<code>
qsub -l m_mem_free=20G -t 1-30 myjob.qs
</code>

This will submit 30 jobs to the queue, with the SGE_TASK_ID variable set for use in the ''myjob.qs'' script (an [[:abstract:farber:runjobs:schedule_jobs#array-jobs|array job]]).

The ''m_mem_free'' resource tells Grid Engine not to schedule a task on a node unless the specified amount of memory, i.e., 20GB per CPU, is available to consume on that node. Since each task is serial and runs on a single CPU, 20GB is the total memory available to each task.
==== Parallel environments ====

The ''/opt/shared/templates/gridengine'' directory contains basic prototype job scripts for non-interactive parallel jobs. This section describes the **–pe** parallel environment option that's required for MPI jobs, OpenMP jobs, and other jobs that use the SMP (threads) programming model.

Type the command:

=== The threads parallel environment ===

Jobs such as those having OpenMP directives use the **//threads//** parallel environment, an implementation of the [[:abstract:farber:app_dev:prog_env#programming-models|shared-memory programming model]]. These SMP jobs can only use the cores on a **single** node.

For example, if your group only owns nodes with 24 cores, then your ''–pe threads'' request may only ask for 24 or fewer slots. Use Grid Engine's **qconf** command to determine the names and characteristics of the queues and compute nodes available to your investing-entity group on a cluster.
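For an SMP job, the key step inside the job script is sizing the OpenMP thread pool from the slot count. A minimal sketch follows; here ''NSLOTS'' is simulated, whereas in a real job Grid Engine sets it from the ''-pe threads'' request:

```shell
# In a real job script Grid Engine exports NSLOTS; simulated here.
NSLOTS=8                        # would come from "-pe threads 8"
export OMP_NUM_THREADS=$NSLOTS  # OpenMP programs read this to size the thread pool
echo "$OMP_NUM_THREADS"
```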

<note tip>
IT provides a job script template called ''openmp.qs'' available in ''/opt/shared/templates/gridengine/openmp'' to copy and customize for your OpenMP jobs.
</note>

MPI jobs inherently generate considerable network traffic among the processor cores of a cluster's compute nodes. The processors on the compute nodes may be connected by two types of networks: InfiniBand and Gigabit Ethernet.

IT has developed templates to help with the **openmpi** parallel environments for Farber, targeting different user needs and architectures. You can copy the templates from ''/opt/shared/templates/gridengine/openmpi'' and customize them. These templates are essentially identical with the exception of the presence or absence of certain **qsub** options and the values assigned to **MPI_FLAGS** based on using particular environment variables. In all cases, the parallel environment option must be specified:

''-pe mpi'' <<//NPROC//>>
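Inside an MPI job script, the slot count requested with ''-pe mpi'' arrives in ''$NSLOTS'' and is typically handed to the launcher. A sketch, with ''NSLOTS'' simulated here and a hypothetical program name:

```shell
# Grid Engine sets NSLOTS from the -pe request; simulated for illustration.
NSLOTS=48                                    # would come from "-pe mpi 48"
echo "mpirun -np $NSLOTS ./my_mpi_program"   # typical Open MPI launch line
```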

<note tip>
IT provides several job script templates in ''/opt/shared/templates/gridengine/openmpi'' to copy and customize for your Open MPI jobs. See [[software:openmpi:farber|Open MPI on Farber]] for more details about these job scripts.
</note>

==== Job Templates ====

Detailed information pertaining to individual kinds of parallel jobs -- like setting the ''OMP_NUM_THREADS'' environment variable to ''$NSLOTS'' for OpenMP programs -- is provided by UD IT in a collection of job template scripts on a per-cluster basis under the ''/opt/shared/templates'' directory. For example, on Farber this directory looks like:

<code bash>
[(it_css:traine)@farber ~]$ ls -l /opt/shared/templates
total 4
drwxr-sr-x 7 frey _sgeadm 104 Jul 17 08:11 dev-projects
</code>
==== Array jobs ====

An array job essentially runs the same job many times by generating a new task for each repetition. Each time, the environment variable **SGE_TASK_ID** is set to a sequence number by Grid Engine, and its value provides input to the job submission script.
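A common pattern is to use **SGE_TASK_ID** to select a per-task input. In the sketch below the input-file naming is hypothetical, and the variable is simulated; Grid Engine sets it for each task of a ''-t 1-30'' job:

```shell
# Grid Engine sets SGE_TASK_ID per task; simulated here for illustration.
SGE_TASK_ID=7
INPUT="data.${SGE_TASK_ID}.txt"     # hypothetical per-task input file
echo "task ${SGE_TASK_ID} processes ${INPUT}"
```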

<note tip>