====== The job queues on Farber ======
Each investing-entity on a cluster has an //owner queue// that exclusively uses the investing-entity's compute nodes.
^ <<//investing_entity//>>.q | |
^ ''standby.q'' | |
^ ::: | You must specify **-l standby=1** as a **qsub** option. You must also use the **-notify** option if your job traps the USR2 termination signal. |
^ ''standby-4h.q'' | |
^ ::: | You must specify **-l standby=1** as a **qsub** option. And, if more than 200 slots are requested, you must also specify a maximum run-time of 4 hours or less via the **-l h_rt=//hh:mm:ss//** option. |
^ ''spillover.q'' | |
</code>
===== Farber "standby" queues =====

Farber has two standby queues, ''standby.q'' and ''standby-4h.q''.

The "standby" queues give your jobs access to idle nodes beyond those your investing-entity purchased.

Grid Engine preferentially allocates standby slots on nodes that are lightly loaded. It assigns these jobs a lower queue-priority than jobs submitted by members of the group owning the node. Consequently, standby jobs may wait longer to start than jobs submitted to an owner queue.

Specify the ''standby'' resource as a **qsub** option to use these queues:

<code text>
qsub -l standby=1 ...
</code>

==== Grid Engine resources governing these queues ====
The "standby" queues are both governed by the ''standby'' resource.

The difference between the two queues is tied to the number of slots and the maximum (wall-clock) //hard// run-time you specify.

  * If you specify a maximum run-time of 4 hours or less (e.g., ''-l h_rt=4:00:00''), then you may request up to 800 slots. The job will be assigned to ''standby-4h.q''.

  * If you do **not** specify a maximum run-time, **or** if you specify a run-time greater than 4 hours but not exceeding 8 hours, then you may request up to 200 slots for any job. The job will be assigned to ''standby.q''.
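The queue-assignment rule just described can be sketched as plain shell logic. This is purely illustrative: ''pick_queue'' is a made-up helper, not anything Grid Engine provides.

```shell
# pick_queue SLOTS HRT_HOURS -- HRT_HOURS empty means no -l h_rt was given.
# Hypothetical helper mirroring the two bullets above.
pick_queue() {
  local slots=$1 hrt=$2
  if [ -n "$hrt" ] && [ "$hrt" -le 4 ]; then
    echo "standby-4h.q"   # 4 hours or less: up to 800 slots
  elif [ "$slots" -le 200 ]; then
    echo "standby.q"      # no h_rt, or h_rt in (4,8] hours: up to 200 slots
  else
    echo "rejected: more than 200 slots requires h_rt of 4 hours or less"
  fi
}

pick_queue 500 4     # standby-4h.q
pick_queue 200 ""    # standby.q
```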

The total number of slots you may have in use concurrently across the ''standby'' queues is 800.

For example, you could concurrently run 25 20-slot jobs (500 slots). This would leave 300 slots available for any other concurrent standby jobs you may submit.
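The slot accounting in that example is simple arithmetic:

```shell
TOTAL=800           # concurrent standby slot limit per user
JOBS=25
SLOTS_PER_JOB=20
USED=$(( JOBS * SLOTS_PER_JOB ))
echo "used=$USED remaining=$(( TOTAL - USED ))"   # used=500 remaining=300
```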

Job script example:
<code bash>
#
# The standby flag asks to run the job in a standby queue.
#$ -l standby=1
#
# This job needs an openmpi parallel environment using 500 slots.
#$ -pe openmpi 500
#
# The h_rt flag specifies a 4-hr maximum (hard) run-time limit.
# The flag is required because the job needs more than 200 slots.
#$ -l h_rt=4:00:00
...
</code>
==== Mapping jobs to nodes ====
Once Grid Engine determines the appropriate standby queue, it maps the job to available, idle nodes (hosts) to fill all the slots. For openmpi jobs, Grid Engine is configured to use the //fill up// allocation rule by default.
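As a rough sketch of the //fill up// rule (not Grid Engine's actual implementation): each 20-core node is filled completely before slots are allocated on the next node. The ''fill_up'' helper below is made up for illustration.

```shell
# fill_up SLOTS CORES_PER_NODE -- print the per-node slot allocation
# under a fill-up policy (illustrative helper, not Grid Engine code).
fill_up() {
  local slots=$1 per_node=$2 out="" alloc
  while [ "$slots" -gt 0 ]; do
    alloc=$(( slots < per_node ? slots : per_node ))
    out="$out $alloc"
    slots=$(( slots - alloc ))
  done
  echo "${out# }"
}

fill_up 50 20    # three nodes get 20, 20, and 10 slots
```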

It may be useful to control the number of nodes and the number of processes per node.
For example:
  qsub -l standby, ...
The MPI processes (ranks) will be mapped to ...
The MPI_FLAG ...

<note important>
If these jobs are assigned to your nodes (you have them for up to 8 hours), they will compete for shared resources:
^ Resource ^ Shared ^
| cores | 20 |
| memory | 64 GB |
If you add the ''exclusive'' resource to your job submission:
  #$ -l exclusive=1
then Grid Engine will round up your slot request to a multiple of 20 and thus keep other jobs off the node.
</note>

<note tip>The allocation rule and the group names are configured in Grid Engine. Use the ''qconf'' command to display the current configuration.

To see the current allocation rule for the ''mpi'' parallel environment:

  $ qconf -sp mpi | grep allocation_rule
  allocation_rule    $fill_up

To see a list of all group names:

  $ qconf -shgrpl
  @128G
  @256G
  ...

To see the nodes in a group name:
  $ qconf -shgrp @128G
  ...
</note>

===== Farber Exclusive access =====

If a job is submitted with the ''exclusive'' resource, Grid Engine will

  * promote any serial jobs to 20-core threaded (-pe threads 20)
  * modify any parallel jobs to round-up the slot count to the nearest multiple of 20
  * ignore any memory resources and make all memory available on all nodes assigned to the job

A job running on a node with the ''exclusive'' resource will be the only job running on that node.

Job script example:
<code bash>
#
# The exclusive flag asks to run this job only on all nodes required to fulfill requested slots
#$ -l exclusive=1
#
# This job needs an openmpi parallel environment using 32 slots = 2 nodes exclusively.
#$ -pe openmpi 32
#
# By default the slot count granted by Grid Engine will be
# used, one MPI worker per slot. Set this variable if you
# want to use fewer cores than Grid Engine granted you (e.g.
# when using exclusive=1):
#
#

...
</code>

<note tip>In the script example, this job would be rounded up to 40 slots and would be assigned 2 nodes. If you really want your job to run with only 32 slots, uncomment and set the variable described in the script's comments.</note>

Grid Engine is configured to "fill up" nodes by allocating as many slots as possible before proceeding to another node to fulfill the total number of requested slots for the job. Unfortunately, this means your job may have to share nodes with other jobs.

To assure that your job will be the only job running on a node (or all nodes needed to satisfy the slots requested), specify the ''exclusive'' resource:

<code text>
qsub -l exclusive=1 ...
</code>
==== Actions at the run-time limit ====
When a standby job reaches its maximum run time, Grid Engine kills the job. The process depends on your use of Grid Engine's **-notify** option.
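A job script submitted with the **-notify** option can trap ''USR2'' and clean up before the final kill arrives. A minimal sketch follows; the handler name and messages are illustrative, and the self-signal at the end only demonstrates that the trap fires (a real script would do its work there and typically exit from the handler after checkpointing):

```shell
#!/bin/bash
#$ -notify
#$ -l standby=1

# Illustrative handler: save state when Grid Engine warns of the kill.
on_usr2() {
    echo "caught SIGUSR2: saving state before the run-time limit kill"
    CAUGHT=1
}
trap on_usr2 USR2

# ... the real work would run here; this self-signal demonstrates the trap:
kill -USR2 $$
echo "continuing after handler"
```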
User = traine
Queue = standby.q@n015
Host = n015.farber.hpc.udel.edu
Start Time = 06/01/2012 12:38:51
End Time = 06/01/2012 16:43:53
==== What if my program does not catch USR2? ====
<
When a program that does not handle the ''USR2'' signal receives it, the default action is for the process to be terminated.
The first ''