abstract:caviness:runjobs:schedule_jobs

abstract:caviness:runjobs:schedule_jobs [2024-01-24 15:56] – [GPU nodes] anita
abstract:caviness:runjobs:schedule_jobs [2024-01-30 17:18] (current) – [GPU nodes] anita
Line 288:
 ==== GPU nodes ====
 
-After entering into the workgroup, GPU nodes can be requested through an interactive session using ''salloc'' or through batch submission using ''sbatch''. An appropriate partition name (such as a workgroup for running or ''devel'' if you need to compile on a GPU node) has to be mentioned while running the command as below.
+After entering into the workgroup, GPU nodes can be requested through an interactive session using ''salloc'' or through batch submission using ''sbatch''. An appropriate partition name (such as a workgroup for running or ''devel'' if you need to compile on a GPU node) and a GPU resource and type **must** be specified while running the command as below.
 
 <code bash>
Line 315:
 </code>
  
-Also if your workgroup has purchased more than one kind of GPU node and you want to target a specific GPU node type, then you can use ''%%--%%gres=gpu:p100'' or ''%%--%%gres=gpu:v100'' or ''%%--%%gres=gpu:t4'' See [[abstract:caviness:runjobs:job_status#sworkgroup|sworkgroup]] to determine your workgroup resources including GPU node type. In the example below, this particular workgroup has (2) ''gpu:p100'' and (2) ''gpu:v100'' types of GPUs available
+Also if your workgroup has purchased more than one kind of GPU node, then you need to choose the specific GPU type to target, such as ''%%--%%gres=gpu:p100'' or ''%%--%%gres=gpu:v100'' or ''%%--%%gres=gpu:t4'' or ''%%--%%gres=gpu:a100'' to get 1 GPU by default, or use the form ''%%--%%gres=gpu:<<GPU type>>:<<#>>'' to request a specific count. See [[abstract:caviness:runjobs:job_status#sworkgroup|sworkgroup]] to determine your workgroup resources including GPU node type. In the example below, this particular workgroup has (2) ''gpu:p100'', (2) ''gpu:v100'' and (2) ''gpu:a100'' types of GPUs available.
  
 <code bash>
 [traine@login00 ~]$ sworkgroup -g ececis_research --limits
 Partition       Per user Per job Per workgroup
----------------+--------+-------+-------------------------------------------------
-devel           jobs   cpu=4
-ececis_research                  cpu=152,mem=1882G,gres/gpu:p100=2,gres/gpu:v100=2
+---------------+--------+-------+-----------------------------------------------------------------
+devel           jobs   cpu=4
+ececis_research                  cpu=248,mem=3075G,gres/gpu:p100=2,gres/gpu:v100=2,gres/gpu:a100=2
 reserved
 standard        cpu=720  cpu=360
 </code>
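As a rough illustration of reading that output, the sketch below pulls the GPU types and counts out of a "Per workgroup" limits string. The helper name ''list_gpu_limits'' is hypothetical (it is not part of ''sworkgroup'' or Slurm), and the limits string is copied by hand from the example above rather than queried from the scheduler.

```shell
#!/bin/bash
# Hypothetical helper (not a Caviness or Slurm tool): given the comma-separated
# "Per workgroup" limits string, print one "<GPU type> <count>" line per GPU entry.
list_gpu_limits() {
    printf '%s\n' "$1" | tr ',' '\n' \
        | sed -n 's|^gres/gpu:\([^=]*\)=\(.*\)$|\1 \2|p'
}

# Limits string copied by hand from the sworkgroup example output above.
limits='cpu=248,mem=3075G,gres/gpu:p100=2,gres/gpu:v100=2,gres/gpu:a100=2'

# Show the largest --gres request each GPU type would allow for this workgroup.
list_gpu_limits "$limits" | while read -r gpu_type gpu_count; do
    echo "--gres=gpu:${gpu_type}:${gpu_count}"
done
```

Each printed flag follows the type-and-count form described above; a smaller count, or omitting the count to get 1 GPU, is also possible.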
  
-Any user can employ a GPU by running in the ''standard'' partition, however keep in mind jobs can be preempted and would require [[abstract:caviness:runjobs:schedule_jobs#handling-system-signals-aka-checkpointing|checkpointing]] as part of your batch job script.  The interactive session example below requests any node with a GPU v100, 1 core and 1 GB of memory (default values if not specified) on the standard partition.
+Any user can employ a GPU by running in the ''standard'' partition; however, keep in mind that a GPU type **must** be specified, and that jobs can be preempted and would require [[abstract:caviness:runjobs:schedule_jobs#handling-system-signals-aka-checkpointing|checkpointing]] as part of your batch job script.  The interactive session example below requests any node with (2) v100 GPUs, 1 core, 1 GB of memory and 30 minutes of time (default values if not specified) on the ''standard'' partition.
  
 <code bash>
-salloc --partition=standard --gres=gpu:v100
+salloc --partition=standard --gres=gpu:v100:2
 </code>
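The same request can be made through ''sbatch''. The following is a minimal sketch of a batch script, assuming the same partition and GPU request as the interactive example and spelling out the default core, memory and time values explicitly. The function ''report_allocation'' is a hypothetical helper, and the checkpointing needed for preemptible jobs on ''standard'' is deliberately left out.

```shell
#!/bin/bash
#SBATCH --partition=standard
#SBATCH --gres=gpu:v100:2
#SBATCH --ntasks=1
#SBATCH --mem=1G
#SBATCH --time=0-00:30:00

# Hypothetical helper: report the job ID and the GPU indices exposed to the job.
# The "none" fallbacks only appear when the script is run outside of a Slurm job.
report_allocation() {
    echo "job=${SLURM_JOB_ID:-none} gpus=${CUDA_VISIBLE_DEVICES:-none}"
}

report_allocation
```

Saved under a name of your choosing, the script would be submitted with ''sbatch'' after entering the workgroup; the real GPU workload would replace or follow the call to ''report_allocation''.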
  
-to allocate any type of GPU node available for your interactive job in the standard partition.
+If you are unsure of the GPU types and counts available in the ''standard'' partition, see [[abstract:caviness:caviness#compute-nodes|Compute Nodes]] on Caviness.
 ==== Enhanced Local Scratch nodes ====
  
  • abstract/caviness/runjobs/schedule_jobs.1706129818.txt.gz
  • Last modified: 2024-01-24 15:56
  • by anita