abstract:caviness:runjobs:queues

Differences

This shows you the differences between two versions of the page.

abstract:caviness:runjobs:queues [2020-04-22 16:53] anita
abstract:caviness:runjobs:queues [2023-05-30 13:48] (current) – [The job queues (partitions) on Caviness] anita
Line 4: Line 4:
 The Caviness cluster has several kinds of partition (queue) available in which to run jobs:

-^Kind^Description^
+^Kind^Description^Nodes
-|standard|The default partition if no ''%%--%%partition'' submission flag is specified; jobs can be preempted (killed)|
+|standard|The default partition if no ''%%--%%partition'' submission flag is specified; jobs can be preempted (killed)|''scontrol show partition standard''
-|devel|A partition with very short runtime limits and small resource limits; important to use for any development using compilers|
+|devel|A partition with very short runtime limits and small resource limits; important to use for any development using compilers|''scontrol show partition devel''
-|workgroup-specific|Partitions associated with specific kinds of compute equipment in the cluster purchased by a research group <<//investing-entity//>> (workgroup)|
+|workgroup-specific|Partitions associated with specific kinds of compute equipment in the cluster purchased by a research group <<//investing-entity//>> (workgroup)|''scontrol show partition''<<//workgroup//>>|

 ===== The standard partition =====
Line 29: Line 29:
   * Testing correctness of program parallelization
   * Interactive sessions
+  * Removing files especially if cleaning up many files and directories in ''$HOME'', ''$WORKDIR'' and ''/lustre/scratch''
 Because performance is not critical for these use cases, the nodes serviced by the ''devel'' partition have hyperthreads enabled, effectively doubling the number of CPUs available.
  
Line 38: Line 39:
 For example:
 <code bash>
+[traine@login01 ~]$ workgroup -g it_css
 [(it_css:traine)@login00 ~]$ srun --partition=devel --nodes=1 --ntasks=1 --cpus-per-task=4 date
 Mon Jul 23 15:25:07 EDT 2018
 </code>

-One copy of the ''date'' command is executed on one node in the ''devel'' partition; the command has four cores (or in this case, hyperthreads) allocated to it.  An interactive shell in the ''devel'' partition with two cores available would be started via:
+One copy of the ''date'' command is executed on one node in the ''devel'' partition; the command has four cores (or in this case, hyperthreads) allocated to it.  An interactive shell in the ''devel'' partition with two cores and one hour of time available would be started via:
 <code bash>
 [traine@login01 ~]$ workgroup -g it_css
-[(it_css:traine)@login01 ~]$ salloc --partition=devel --cpus-per-task=2
+[(it_css:traine)@login01 ~]$ salloc --partition=devel --cpus-per-task=2 --time=1:0:0
 salloc: Granted job allocation 940
 salloc: Waiting for resource configuration
Line 68: Line 70:
 <code bash>
 $ workgroup -g it_nss
-$ sbatch --verbose --account=it_css --partition=_workgroup_ …
+$ sbatch --verbose --partition=_workgroup_ …
   :
 sbatch: partition         : _workgroup_
Line 74: Line 76:
 Submitted batch job 1234
 $ scontrol show job 1234 | egrep -i '(partition|account)='
-   Priority=2014 Nice=0 Account=it_css QOS=normal
+   Priority=2014 Nice=0 Account=it_nss QOS=normal
    Partition=it_nss AllocNode:Sid=login01:7280
 </code>

-Job 1234 is billed against the it_css account but executes in the it_nss workgroup partition (assuming the it_css account has been granted access to that partition).  When the job executes, all processes start with the it_nss Unix group.
+Job 1234 is billed against the it_nss account because it is in the it_nss workgroup partition.  When the job executes, all processes start with the it_nss Unix group.

 To check what your workgroup has access to and the guaranteed resources on the Caviness refer to [[abstract:caviness:runjobs:job_status#Available-Resources|Resources]].
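
The account/partition check shown with ''scontrol show job 1234'' can be wrapped in a small shell helper. The sketch below is illustrative only: the function name ''job_account_partition'' is hypothetical and not part of the wiki page, and the sample input is the two output lines quoted in the revised page rather than live cluster output.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not from the wiki page): read `scontrol show job <id>`
# output on stdin and print only the Account= and Partition= fields.
job_account_partition() {
    # -o prints just the matched text; -E enables the (a|b) alternation.
    grep -oE '(Account|Partition)=[^[:space:]]+' | sort
}

# Sample input copied from the revised page. On the cluster this would be:
#   scontrol show job 1234 | job_account_partition
printf '%s\n' \
    '   Priority=2014 Nice=0 Account=it_nss QOS=normal' \
    '   Partition=it_nss AllocNode:Sid=login01:7280' \
    | job_account_partition
```

With one field per line, it is easy to see at a glance whether a job's billed account and its partition name the same workgroup, which is the point the changed paragraph in this revision is making.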
  
  • abstract/caviness/runjobs/queues.1587588808.txt.gz
  • Last modified: 2020-04-22 16:53
  • by anita