The job queues (partitions) on Caviness
The Caviness cluster has several kinds of partition (queue) available in which to run jobs:
Kind | Description | Nodes |
---|---|---|
standard | The default partition if no --partition submission flag is specified; jobs can be preempted (killed) | scontrol show partition standard |
devel | A partition with very short runtime limits and small resource limits; important to use for any development using compilers | scontrol show partition devel |
workgroup-specific | Partitions associated with specific kinds of compute equipment in the cluster purchased by a research group «investing-entity» (workgroup) | scontrol show partition «workgroup» |
The standard partition
This partition is the default when no --partition submission flag is specified. Anyone on Caviness can request resources from the standard partition. However, job preemption logic (discussed below) is implemented on this partition to ensure that workgroup-specific jobs are prioritized.
The standard partition roughly combines the standby and spillover queue concepts from the earlier clusters.
Limits to jobs submitted to this partition are:
- a maximum runtime of 7 days (default is 30 minutes)
- a maximum of 360 CPUs per job
- a maximum of 720 CPUs per user
Jobs in the standard partition are subject to preemption. When a job submitted to a workgroup-specific partition needs resources tied up by jobs in the standard partition, those jobs are preempted (killed after a 5-minute grace period) to release the resources. For more information on how to handle your job if it is preempted, please refer to Checkpointing.
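As an illustration of using the grace period, the batch script below is a minimal sketch, not the site's recommended method. It assumes the SIGTERM sent at preemption reaches the batch shell (this depends on how Slurm is configured), and the names CHECKPOINT_FILE, TOTAL_STEPS, save_checkpoint and on_preempt are hypothetical. The `sleep` stands in for one unit of real work.

```shell
#!/bin/bash
#SBATCH --partition=standard
#SBATCH --time=0-01:00:00     # override the 30-minute default runtime
#SBATCH --ntasks=1

# Hypothetical checkpoint location and step count; adjust for your workflow.
CHECKPOINT_FILE="${CHECKPOINT_FILE:-./checkpoint.txt}"
TOTAL_STEPS="${TOTAL_STEPS:-3}"
STEP=0

# Record the last completed step so a resubmitted job can resume from it.
save_checkpoint() {
    echo "$STEP" > "$CHECKPOINT_FILE"
}

# Assumption: SIGTERM arrives when the preemption grace period starts.
on_preempt() {
    save_checkpoint
    exit 143   # 128 + 15 (SIGTERM)
}
trap on_preempt TERM

# Resume from an earlier checkpoint if one exists.
if [ -f "$CHECKPOINT_FILE" ]; then
    STEP=$(cat "$CHECKPOINT_FILE")
fi

while [ "$STEP" -lt "$TOTAL_STEPS" ]; do
    sleep 1 &      # stand-in for one unit of real work
    wait $!        # waiting on a background job lets the trap run promptly
    STEP=$((STEP + 1))
done

save_checkpoint
echo "finished at step $STEP"
```

Running the work in the background and calling `wait` matters here: bash defers trap handlers until a foreground command finishes, so a long foreground step could use up the whole grace period before the checkpoint is written.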
The devel partition
This partition is used for short-lived jobs with minimal resource needs. Typical uses for the devel queue include:
- Compiling code for projects that cannot be built on the login (head) node, on a compute node that is guaranteed to have the development tools, libraries, etc. that compilers need
- Running test jobs to vet programs or changes to programs
- Testing correctness of program parallelization
- Interactive sessions
- Removing files, especially when cleaning up many files and directories in $HOME, $WORKDIR and /lustre/scratch
Because performance is not critical for these use cases, the nodes serviced by the devel partition have hyperthreads enabled, effectively doubling the number of CPUs available.
Limits to jobs submitted to this partition are:
- a maximum runtime of 2 hours (default is 30 minutes)
- each user can submit up to 2 jobs
- each job can use up to 4 cores on a single node
For example:
    [traine@login01 ~]$ workgroup -g it_css
    [(it_css:traine)@login01 ~]$ srun --partition=devel --nodes=1 --ntasks=1 --cpus-per-task=4 date
    Mon Jul 23 15:25:07 EDT 2018
One copy of the date command is executed on one node in the devel partition; the command has four cores (or in this case, hyperthreads) allocated to it. An interactive shell in the devel partition with two cores and one hour of time available would be started via:
    [traine@login01 ~]$ workgroup -g it_css
    [(it_css:traine)@login01 ~]$ salloc --partition=devel --cpus-per-task=2 --time=1:0:0
    salloc: Granted job allocation 940
    salloc: Waiting for resource configuration
    salloc: Nodes r00n56 are ready for job
    [traine@r00n56 ~]$ echo $SLURM_CPUS_ON_NODE
    2
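The devel limits listed above can be checked before submitting, so an oversized request is not sent to the scheduler at all. The helper below is a hypothetical sketch: check_devel_request is not a cluster-provided command, and the limits (4 cores, 2 hours) are hard-coded from this page and may change.

```shell
#!/bin/bash
# Hypothetical pre-submission check against the devel limits stated on this page.
DEVEL_MAX_CPUS=4        # each job can use up to 4 cores on a single node
DEVEL_MAX_MINUTES=120   # maximum runtime of 2 hours

check_devel_request() {
    local cpus="$1"      # value intended for --cpus-per-task
    local minutes="$2"   # intended runtime in minutes
    if [ "$cpus" -gt "$DEVEL_MAX_CPUS" ]; then
        echo "devel allows at most $DEVEL_MAX_CPUS CPUs per job (requested $cpus)" >&2
        return 1
    fi
    if [ "$minutes" -gt "$DEVEL_MAX_MINUTES" ]; then
        echo "devel allows at most $DEVEL_MAX_MINUTES minutes (requested $minutes)" >&2
        return 1
    fi
    echo "ok"
}

# Example: two cores for one hour fits within the limits.
check_devel_request 2 60
```

A request that passes could then be chained with the real submission, for example `check_devel_request 2 60 && salloc --partition=devel --cpus-per-task=2 --time=1:0:0`.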
The workgroup-specific partitions
Investing-entity (workgroup) partitions (queues) are similar to the owner queues on Mills and Farber; however, on Caviness distinct nodes are not assigned to a workgroup-specific partition. Instead, the investing-entity (workgroup) is given priority access spanning all cluster resources of each node type the workgroup purchased. Each workgroup-specific partition reuses the existing workgroup QOS as its default (baseline) QOS. The QOS limits the resources available to the workgroup while guaranteeing access to what was purchased, by preempting (killing) jobs in the standard partition to make way for jobs submitted to the workgroup-specific partitions. A special flag enables a check for the name _workgroup_ in the list of partitions requested for the job. If enabled, _workgroup_ is replaced with the investing-entity (workgroup) name under which the user submitted the job (e.g. the name set with workgroup -g «investing-entity»).
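The effect of the _workgroup_ substitution can be sketched in a few lines of shell. This only illustrates the behavior described above and is not the code the cluster actually runs; the function name expand_workgroup_partition is hypothetical.

```shell
#!/bin/bash
# Sketch of the _workgroup_ substitution; not the cluster's actual implementation.
expand_workgroup_partition() {
    local partitions="$1"   # comma-separated list, as given to --partition
    local workgroup="$2"    # workgroup under which the job is submitted
    local IFS=','
    local -a parts out=()
    local p
    read -r -a parts <<< "$partitions"
    for p in "${parts[@]}"; do
        # Replace only whole list entries that equal _workgroup_.
        if [ "$p" = "_workgroup_" ]; then
            out+=("$workgroup")
        else
            out+=("$p")
        fi
    done
    echo "${out[*]}"        # re-join with commas (first character of IFS)
}

expand_workgroup_partition "_workgroup_" "it_nss"
expand_workgroup_partition "standard,_workgroup_" "it_css"
```

Matching whole list entries rather than substrings means a partition whose name merely contains the text _workgroup_ is left unchanged.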
Limits to jobs submitted to workgroup-specific partitions:
- a maximum runtime of 7 days (default is 30 minutes)
- per-workgroup resource limits (QOS) based on:
  - how many nodes your research group (workgroup) purchased (node=#)
  - how many cores your research group (workgroup) purchased (cpu=#)
  - how many GPUs your research group (workgroup) purchased (gres/gpu:<kind>=#)
For example:
    $ workgroup -g it_nss
    $ sbatch --verbose --partition=_workgroup_ …
    :
    sbatch: partition         : _workgroup_
    :
    Submitted batch job 1234
    $ scontrol show job 1234 | egrep -i '(partition|account)='
       Priority=2014 Nice=0 Account=it_nss QOS=normal
       Partition=it_nss AllocNode:Sid=login01:7280
Job 1234 is billed against the it_nss account because it is in the it_nss workgroup partition. When the job executes, all processes start with the it_nss Unix group.
To check what your workgroup has access to and its guaranteed resources on Caviness, refer to Resources.
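The per-workgroup QOS limits use the node=#, cpu=# and gres/gpu:<kind>=# form listed above. As a rough sketch, a single value can be pulled out of such a comma-separated string in shell. The function tres_value and the sample string are hypothetical; the numbers are made up for illustration. On a live system the string might come from something like `sacctmgr show qos format=Name,GrpTRES`, which is an assumption about how the site exposes these limits.

```shell
#!/bin/bash
# Hypothetical helper: extract one limit from a comma-separated limit string.
tres_value() {
    local tres="$1"   # e.g. "cpu=72,node=2,gres/gpu:p100=2"
    local key="$2"    # e.g. "cpu" or "gres/gpu:p100"
    local IFS=','
    local -a items
    local item
    read -r -a items <<< "$tres"
    for item in "${items[@]}"; do
        # Compare the text before the first '=' with the requested key.
        if [ "${item%%=*}" = "$key" ]; then
            echo "${item#*=}"
            return 0
        fi
    done
    return 1          # key not present in the string
}

# Illustrative values only; not the limits of any real workgroup.
SAMPLE_LIMITS="cpu=72,node=2,gres/gpu:p100=2"
tres_value "$SAMPLE_LIMITS" cpu
tres_value "$SAMPLE_LIMITS" gres/gpu:p100
```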