The DARWIN cluster has several partitions (queues) available to specify when running jobs.  These partitions correspond to the various node types available in the cluster:
  
^Partition Name^Description^Node Names^
|standard|Contains all 48 standard memory nodes (64 cores, 512 GiB memory per node)|r1n00 - r1n47|
|large-mem|Contains all 32 large memory nodes (64 cores, 1024 GiB memory per node)|r2l00 - r2l31|
|xlarge-mem|Contains all 11 extra-large memory nodes (64 cores, 2048 GiB memory per node)|r2x00 - r2x10|
|extended-mem|Contains the single extended memory node (64 cores, 1024 GiB memory + 2.73 TiB NVMe swap)|r2e00|
|gpu-t4|Contains all 9 NVIDIA Tesla T4 GPU nodes (64 cores, 512 GiB memory, 1 T4 GPU per node)|r1t00 - r1t07, r2t08|
|gpu-v100|Contains all 3 NVIDIA Tesla V100 GPU nodes (48 cores, 768 GiB memory, 4 V100 GPUs per node)|r2v00 - r2v02|
|gpu-mi50|Contains the single AMD Radeon Instinct MI50 GPU node (64 cores, 512 GiB memory, 1 MI50 GPU)|r2m00|
|gpu-mi100|Contains the single AMD Radeon Instinct MI100 GPU node (64 cores, 512 GiB memory, 1 MI100 GPU)|r2m01|
|idle|Contains all nodes in the cluster; jobs in this partition can be preempted but are not charged against your allocation| |
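
The partition for a job is selected with Slurm's ''%%--%%partition'' flag.  As a minimal sketch (the job name, core count, time limit, and program below are illustrative placeholders, not DARWIN requirements), a batch script targeting the ''standard'' partition might look like:

<code bash>
#!/bin/bash
#
# Minimal sketch of a batch script for the standard partition.
# Job name, core count, time limit, and the program run are placeholders.
#SBATCH --job-name=example_job
#SBATCH --partition=standard
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --time=01:00:00

# Replace with the actual command(s) your job should run
./my_program
</code>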
  
===== Requirements for all partitions =====
  * Maximum of 320 jobs per user
  * Maximum of 640 CPUs per user (across all jobs in the partition)

==== Maximum Requestable Memory ====

Each type of node (and thus, partition) has a limited amount of memory available for jobs.  A small amount of memory must be subtracted from the nominal size listed in the table above for the node's operating system and Slurm.  The remainder is the upper limit requestable by jobs, summarized by partition below:

^Partition Name^Maximum (by node)^Maximum (by core)^
|standard|''--mem=499712M''|''--mem-per-cpu=7808M''|
|large-mem|''--mem=999424M''|''--mem-per-cpu=15616M''|
|xlarge-mem|''--mem=2031616M''|''--mem-per-cpu=31744M''|
|extended-mem|''--mem=999424M''|''--mem-per-cpu=15616M''|
|gpu-t4|''--mem=491520M''|''--mem-per-cpu=7680M''|
|gpu-v100|''--mem=737280M''|''--mem-per-cpu=15360M''|
|gpu-mi50|''--mem=491520M''|''--mem-per-cpu=7680M''|
|gpu-mi100|''--mem=491520M''|''--mem-per-cpu=7680M''|
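
Note that each per-core maximum multiplied by the node's core count equals the per-node maximum (for example, 7808M × 64 cores = 499712M on a ''standard'' node).  A rough sketch of using these limits follows; ''my_job.qs'' is a placeholder script name, and the flag values are taken from the table above:

<code bash>
# Sketch: request all of the memory available on one large-mem node
sbatch --partition=large-mem --nodes=1 --mem=999424M my_job.qs

# Sketch: request the per-core maximum for an 8-core job on a standard node
sbatch --partition=standard --ntasks=8 --mem-per-cpu=7808M my_job.qs
</code>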
  
===== The extended-mem partition =====
  
===== The idle partition =====

The ''idle'' partition contains all nodes in the cluster.  Jobs submitted to the ''idle'' partition **can be preempted** when the resources are required for jobs submitted to the other partitions.  Your job should support [[abstract:darwin:runjobs:schedule_jobs#handling-system-signals-aka-checkpointing|checkpointing]] to effectively use the ''idle'' partition and avoid lost work.

<note warning>Be aware that implementing checkpointing is highly dependent on the nature of your job and the ability of your code or software to handle interruptions and restarts.  For this reason, we can only provide limited support of the idle partition.</note>
  
Jobs in the ''idle'' partition that have been running for less than 10 minutes are not considered for preemption by Slurm.  Additionally, there is a 5 minute grace period between the delivery of the initial preemption signal (SIGCONT+SIGTERM) and the end of the job (SIGCONT+SIGTERM+SIGKILL).  This means jobs in the ''idle'' partition will have a minimum of 15 minutes of execution time once started.  Jobs submitted using the ''--requeue'' flag automatically return to the queue to be rescheduled once resources are available again.
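
How a checkpoint is written is entirely application-specific, but as an illustrative sketch (the ''write_checkpoint.sh'' and ''long_running_task.sh'' scripts are hypothetical placeholders), an ''idle''-partition batch script could trap the preemption signal and save state during the grace period:

<code bash>
#!/bin/bash
#SBATCH --partition=idle
#SBATCH --requeue
#SBATCH --time=12:00:00

# Hypothetical handler: when Slurm delivers SIGTERM at preemption, write a
# checkpoint during the grace period and exit so --requeue can reschedule the job.
save_and_exit() {
    echo "Preemption signal received; writing checkpoint"
    ./write_checkpoint.sh   # placeholder for your application's checkpoint step
    exit 0
}
trap save_and_exit SIGTERM

# Placeholder workload, run in the background so the shell can handle the signal
./long_running_task.sh &
wait
</code>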
  
Jobs that execute in the ''idle'' partition do not result in charges against your allocation(s).  However, they do accumulate resource usage for the sake of scheduling priority to ensure fair access to this partition.  If your jobs can support [[abstract:darwin:runjobs:schedule_jobs#handling-system-signals-aka-checkpointing|checkpointing]], the ''idle'' partition will enable you to continue your research even if you exhaust your allocation(s).

==== Requesting a specific resource type in the idle partition ====

Since the ''idle'' partition contains all nodes in the cluster, you will need to request a specific GPU type if your job needs GPU resources.  The three GPU types are:

^Type^Description^
|''tesla_t4''|NVIDIA Tesla T4|
|''tesla_v100''|NVIDIA Tesla V100|
|''amd_mi50''|AMD Radeon Instinct MI50|

To request a specific GPU type while using the ''idle'' partition, include the ''%%--%%gpus=<type>:<count>'' flag with your job submission.  For example, ''%%--%%gpus=tesla_t4:4'' would request 4 NVIDIA Tesla T4 GPUs.
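
For instance, the following sketches show the flag in context (the script name, CPU count, and time limit are arbitrary placeholders):

<code bash>
# Sketch: interactive session in the idle partition with one V100 GPU
salloc --partition=idle --gpus=tesla_v100:1 --cpus-per-task=8 --time=02:00:00

# Sketch: batch submission requesting 4 T4 GPUs, as in the example above
sbatch --partition=idle --gpus=tesla_t4:4 my_gpu_job.qs
</code>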