The job queues (partitions) on Caviness

The Caviness cluster has several kinds of partition (queue) available in which to run jobs:

  • standard: The default partition if no --partition submission flag is specified; jobs can be preempted (killed). Details: scontrol show partition standard
  • devel: A partition with very short runtime limits and small resource limits; important to use for any development using compilers. Details: scontrol show partition devel
  • workgroup-specific: Partitions associated with specific kinds of compute equipment in the cluster purchased by a research group «investing-entity» (workgroup). Details: scontrol show partition «workgroup»

The standard partition is the default when no --partition submission flag is specified, and any Caviness user can request resources from it. However, job preemption logic (discussed below) is implemented on this partition to ensure that workgroup-specific jobs are prioritized.

The standard partition is conceptually similar to a combination of the standby and spillover queues on the earlier clusters.

Limits to jobs submitted to this partition are:

  • a maximum runtime of 7 days (default is 30 minutes)
  • a maximum of 360 CPUs per job
  • a maximum of 720 CPUs per user
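
The per-job limits listed above can be checked before submitting. The following is a minimal sketch, not a site-provided tool: the function name fits_standard and the job script name job.qs are hypothetical, and the per-user limit of 720 CPUs is not checked because it depends on the user's other running jobs.

```shell
# Limits quoted from the list above; update them if the site changes its policy.
STANDARD_MAX_CPUS_PER_JOB=360
STANDARD_MAX_MINUTES=$((7 * 24 * 60))   # 7 days expressed in minutes

# fits_standard CPUS MINUTES
# Succeeds (exit status 0) only if both values are within the per-job limits.
fits_standard() {
    [ "$1" -le "$STANDARD_MAX_CPUS_PER_JOB" ] && [ "$2" -le "$STANDARD_MAX_MINUTES" ]
}

# Example request: 72 CPUs for 2 days. job.qs is a hypothetical job script name.
if fits_standard 72 $((2 * 24 * 60)); then
    # Only print the command here; run it yourself once it looks right.
    echo "sbatch --partition=standard --ntasks=72 --time=2-00:00:00 job.qs"
else
    echo "request exceeds the standard partition per-job limits" >&2
fi
```

The sketch only prints the sbatch command, so it can be tried on any machine, including one without Slurm installed.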

Jobs in the standard partition are subject to preemption: when a job submitted to a workgroup-specific partition needs resources that are tied up by jobs in the standard partition, those jobs are preempted (killed after a 5-minute grace period) to release the resources. For more information on how to handle your job if it is preempted, please refer to Checkpointing.
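
One way to make use of the grace period is to save state when the job is told to stop. The sketch below is an illustration only. It assumes the preemption notice reaches the job script as SIGTERM, which depends on how Slurm is configured and on how the work is launched. The file name checkpoint.state and the counting loop are hypothetical stand-ins for an application's real save and resume logic.

```shell
#!/bin/bash
# Hypothetical checkpoint file; a real application needs its own save format.
CHECKPOINT_FILE="${CHECKPOINT_FILE:-checkpoint.state}"
step=0

save_checkpoint() {
    # Record the last completed step so a resubmitted job can resume from it.
    echo "$step" > "$CHECKPOINT_FILE"
}

# Assumption: the preemption notice arrives as SIGTERM during the grace period.
trap 'save_checkpoint; exit 143' TERM

# Resume from an earlier checkpoint if one exists.
if [ -f "$CHECKPOINT_FILE" ]; then
    step=$(cat "$CHECKPOINT_FILE")
fi

while [ "$step" -lt 3 ]; do
    step=$((step + 1))      # stand-in for one unit of real work
    save_checkpoint
done
echo "finished at step $step"
```

Because progress is also saved after every step, a job that is killed without receiving the signal still loses at most one unit of work when it is resubmitted.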

The devel partition is used for short-lived jobs with minimal resource needs. Typical uses for the devel partition include:

  • Compiling code for projects when this cannot be done on the login (head) node, which also ensures you are allocated a compute node with the development tools, libraries, etc. that compilers need.
  • Running test jobs to vet programs or changes to programs
  • Testing correctness of program parallelization
  • Interactive sessions
  • Removing files, especially when cleaning up many files and directories in $HOME, $WORKDIR and /lustre/scratch

Because performance is not critical for these use cases, the nodes serviced by the devel partition have hyperthreads enabled, effectively doubling the number of CPUs available.

Limits to jobs submitted to this partition are:

  • a maximum runtime of 2 hours (default is 30 minutes)
  • each user can submit up to 2 jobs
  • each job can use up to 4 cores on a single node

For example:

[traine@login01 ~]$ workgroup -g it_css
[(it_css:traine)@login01 ~]$ srun --partition=devel --nodes=1 --ntasks=1 --cpus-per-task=4 date
Mon Jul 23 15:25:07 EDT 2018

One copy of the date command is executed on one node in the devel partition; the command has four cores (or in this case, hyperthreads) allocated to it. An interactive shell in the devel partition with two cores and one hour of time available would be started via:

[traine@login01 ~]$ workgroup -g it_css
[(it_css:traine)@login01 ~]$ salloc --partition=devel --cpus-per-task=2 --time=1:0:0
salloc: Granted job allocation 940
salloc: Waiting for resource configuration
salloc: Nodes r00n56 are ready for job
[traine@r00n56 ~]$ echo $SLURM_CPUS_ON_NODE 
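
For compiles, the same devel request can be written as a batch script. This is a minimal sketch under the limits listed above (4 cores, within the 2 hour maximum). The make command is a hypothetical build step and is left commented out, and the fallback value of 1 is only there so the script also runs outside of Slurm.

```shell
#!/bin/bash
#SBATCH --partition=devel
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --time=0:30:00

# Slurm sets SLURM_CPUS_PER_TASK inside the job when --cpus-per-task is given;
# fall back to 1 when the variable is not set (e.g. when run outside Slurm).
NPROC="${SLURM_CPUS_PER_TASK:-1}"
echo "building with ${NPROC} parallel make jobs"

# Hypothetical build step; uncomment and adapt for a real project.
# make -j "${NPROC}"
```

Tying the make parallelism to the allocation keeps the build from starting more processes than the cores (hyperthreads) the job was granted.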

Investing-entity (workgroup) partitions (queues) are similar to the owner queues on Mills and Farber; however, on Caviness distinct nodes are not assigned to a workgroup-specific partition. Instead, the investing-entity (workgroup) is given priority access spanning all of the cluster resources of each node type that the workgroup purchased on Caviness. Each workgroup-specific partition reuses the existing workgroup QOS as its default (baseline) QOS. This limits the resources available to the workgroup and at the same time guarantees access to what was purchased, by preempting (killing) jobs in the standard partition to make way for jobs submitted to the workgroup-specific partitions.

There is a special flag to check for the presence of the name _workgroup_ in the list of requested partitions for the job. If enabled, the word _workgroup_ is replaced with the investing-entity (workgroup) name under which the user submitted the job (e.g. as set by workgroup -g «investing-entity»).
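
The effect of the _workgroup_ substitution can be illustrated with a small shell function. This is not the code Slurm runs on Caviness; the function name expand_partitions is hypothetical, and the sketch only mimics the described behavior on a comma-separated partition list.

```shell
# expand_partitions PARTITION_LIST WORKGROUP
# Prints PARTITION_LIST with every entry named exactly _workgroup_
# replaced by WORKGROUP. Other entries are left untouched.
expand_partitions() {
    list=$1
    wg=$2
    out=""
    old_ifs=$IFS
    IFS=,
    for p in $list; do
        if [ "$p" = "_workgroup_" ]; then
            p=$wg
        fi
        out="${out:+$out,}$p"   # join entries with commas again
    done
    IFS=$old_ifs
    printf '%s\n' "$out"
}

expand_partitions "_workgroup_,standard" it_nss   # prints it_nss,standard
```

Matching whole entries rather than substrings means a partition whose name merely contains the text _workgroup_ would not be rewritten.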

Limits to jobs submitted to workgroup-specific partitions:

  • a maximum runtime of 7 days (default is 30 minutes)
  • per-workgroup resource limits (QOS) based on
    • how many nodes your research group (workgroup) purchased (node=#)
    • how many cores your research group (workgroup) purchased (cpu=#)
    • how many GPUs your research group (workgroup) purchased (gres/gpu:<kind>=#)

For example:

$ workgroup -g it_nss
$ sbatch --verbose --partition=_workgroup_ …
sbatch: partition         : _workgroup_
Submitted batch job 1234
$ scontrol show job 1234 | egrep -i '(partition|account)='
   Priority=2014 Nice=0 Account=it_nss QOS=normal
   Partition=it_nss AllocNode:Sid=login01:7280

Job 1234 is billed against the it_nss account because it is in the it_nss workgroup partition. When the job executes, all processes start with the it_nss Unix group.
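
When scripting this check, individual fields can be pulled out of the scontrol output. The sketch below uses a hypothetical helper named job_field and, so that it runs anywhere, feeds it the two sample lines from the example above instead of calling scontrol.

```shell
# job_field KEY
# Reads "scontrol show job" style output on stdin and prints the value
# of the first KEY=value pair found.
job_field() {
    awk -v key="$1" '{
        for (i = 1; i <= NF; i++)
            if (index($i, key "=") == 1) {
                print substr($i, length(key) + 2)
                exit
            }
    }'
}

# Sample lines copied from the example output above.
sample='   Priority=2014 Nice=0 Account=it_nss QOS=normal
   Partition=it_nss AllocNode:Sid=login01:7280'

printf '%s\n' "$sample" | job_field Partition   # prints it_nss
printf '%s\n' "$sample" | job_field Account     # prints it_nss
```

On the cluster the same helper would be used as: scontrol show job 1234 | job_field Partition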

To check what your workgroup has access to, and its guaranteed resources on Caviness, refer to Resources.
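
The per-workgroup limits use the key=value notation shown in the list above (node=#, cpu=#, gres/gpu:<kind>=#). As a rough sketch, a limit string in that notation can be split into its parts as follows. The helper name tres_limit, the numbers, and the GPU kind p100 are hypothetical and do not describe any real workgroup.

```shell
# tres_limit LIMIT_STRING KEY
# Prints the value stored under KEY in a comma-separated key=value list,
# or nothing if KEY is absent.
tres_limit() {
    printf '%s\n' "$1" | tr ',' '\n' | sed -n "s|^$2=||p"
}

# Hypothetical limits for illustration only.
limits="cpu=72,node=2,gres/gpu:p100=2"

echo "cores: $(tres_limit "$limits" cpu)"
echo "nodes: $(tres_limit "$limits" node)"
echo "GPUs:  $(tres_limit "$limits" gres/gpu:p100)"
```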

  • Last modified: 2023-05-30 13:48
  • by anita