abstract:caviness:runjobs:schedule_jobs

Differences

This shows you the differences between two versions of the page.

abstract:caviness:runjobs:schedule_jobs [2024-01-24 15:56] – [GPU nodes] anita
abstract:caviness:runjobs:schedule_jobs [2024-01-30 17:13] – [GPU nodes] anita
Line 288:
 ==== GPU nodes ====
  
-After entering into the workgroup, GPU nodes can be requested through an interactive session using ''salloc'' or through batch submission using ''sbatch''. An appropriate partition name (such as a workgroup for running or ''devel'' if you need to compile on a GPU node) has to be mentioned while running the command as below.
+After entering into the workgroup, GPU nodes can be requested through an interactive session using ''salloc'' or through batch submission using ''sbatch''. An appropriate partition name (such as a workgroup for running or ''devel'' if you need to compile on a GPU node) and a GPU resource and type **must** be specified while running the command as below.
  
 <code bash>
Line 315:
 </code>
  
-Also if your workgroup has purchased more than one kind of GPU node and you want to target a specific GPU node type, then you can use ''%%--%%gres=gpu:p100'' or ''%%--%%gres=gpu:v100'' or ''%%--%%gres=gpu:t4'' See [[abstract:caviness:runjobs:job_status#sworkgroup|sworkgroup]] to determine your workgroup resources including GPU node type. In the example below, this particular workgroup has (2) ''gpu:p100'' and (2) ''gpu:v100'' types of GPUs available
+Also if your workgroup has purchased more than one kind of GPU node, then you need to choose that specific GPU type to target it, such as ''%%--%%gres=gpu:p100'' or ''%%--%%gres=gpu:v100'' or ''%%--%%gres=gpu:t4'' or ''%%--%%gres=gpu:a100'' to by default get 1 GPU or the form ''%%--%%gres=gpu:<<GPU type>>:<<#>>'' See [[abstract:caviness:runjobs:job_status#sworkgroup|sworkgroup]] to determine your workgroup resources including GPU node type. In the example below, this particular workgroup has (2) ''gpu:p100'', (2) ''gpu:v100'' and (2) ''gpu:a100'' types of GPUs available
  
 <code bash>
 [traine@login00 ~]$ sworkgroup -g ececis_research --limits
 Partition       Per user Per job Per workgroup
----------------+--------+-------+-------------------------------------------------
+---------------+--------+-------+-----------------------------------------------------------------
-devel           jobs   cpu=4
+devel           jobs   cpu=4
-ececis_research                  cpu=152,mem=1882G,gres/gpu:p100=2,gres/gpu:v100=2
+ececis_research                  cpu=248,mem=3075G,gres/gpu:p100=2,gres/gpu:v100=2,gres/gpu:a100=2
 reserved
 standard        cpu=720  cpu=360
 </code>
  
-Any user can employ a GPU by running in the ''standard'' partition, however keep in mind jobs can be preempted and would require [[abstract:caviness:runjobs:schedule_jobs#handling-system-signals-aka-checkpointing|checkpointing]] as part of your batch job script.  The interactive session example below requests any node with a GPU v100, 1 core and 1 GB of memory (default values if not specified) on the standard partition.
+Any user can employ a GPU by running in the ''standard'' partition, however keep in mind a GPU type **must** be specified, jobs can be preempted and would require [[abstract:caviness:runjobs:schedule_jobs#handling-system-signals-aka-checkpointing|checkpointing]] as part of your batch job script.  The interactive session example below requests any node with (2) GPUs v100 type, 1 core, 1 GB of memory and 30 minutes of time (default values if not specified) on the ''standard'' partition.
  
 <code bash>
-salloc --partition=standard --gres=gpu:v100
+salloc --partition=standard --gres=gpu:v100:2
 </code>
  
-to allocate any type of GPU node available for your interactive job in the standard partition.
+If you are unsure of what GPU types are available when using the ''standard'' partition, see [[abstract:caviness:caviness#compute-nodes|Compute Nodes]] on Caviness.
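
The ''%%--%%gres=gpu:<type>:<count>'' form in the newer revision can be illustrated with a small shell sketch. The helper name ''gpu_gres'' is hypothetical (it is not part of Slurm or of any Caviness tooling), and the script only prints the resulting ''salloc'' command rather than submitting anything. The count defaults to 1, mirroring the default described in the revised text.

```bash
#!/usr/bin/env bash
# gpu_gres is a hypothetical helper for illustration only; it is not
# provided by Slurm or Caviness. It formats a --gres value as
# gpu:<type>:<count>, with the count defaulting to 1.
gpu_gres() {
    local gpu_type="$1"
    local gpu_count="${2:-1}"
    if [ -z "$gpu_type" ]; then
        echo "usage: gpu_gres <type> [count]" >&2
        return 1
    fi
    printf 'gpu:%s:%s' "$gpu_type" "$gpu_count"
}

# Print (do not run) the interactive request for two v100 GPUs
# on the standard partition.
echo "salloc --partition=standard --gres=$(gpu_gres v100 2)"
```

Printing the command first makes it easy to check the GPU type and count before requesting resources; which types are valid depends on the partition and workgroup, as reported by ''sworkgroup''.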
 ==== Enhanced Local Scratch nodes ====
  
  • abstract/caviness/runjobs/schedule_jobs.txt
  • Last modified: 2024-05-23 11:32
  • by anita