abstract:darwin:runjobs:schedule_jobs

Last modified: 2023-03-20 14:27 by anita
  
Need help? See [[http://www.hpc.udel.edu/presentations/intro_to_slurm/|Introduction to Slurm]] in UD's HPC community cluster environment.

<note warning>**IMPORTANT:** When a job is submitted, the SUs will be calculated and pre-debited based on the resources requested, thereby putting a hold on and deducting the SUs from the allocation credit for your project/workgroup. However, once the job completes, the amount of SUs debited will be based on the actual time used. Keep in mind that if you request 20 cores and your job only takes advantage of 10 cores, the job will still be billed based on the requested 20 cores. And specifying a time limit of 2 days versus 2 hours may prevent others in your project/workgroup from running jobs, as those SUs will be unavailable until the job completes. On the other hand, if you do not request enough resources and your job fails (i.e., you did not request enough time, enough cores, etc.), you will still be billed for those SUs. See [[abstract:darwin:runjobs:schedule_jobs#command-options|Scheduling Jobs Command options]] for help with specifying resources and [[abstract:darwin:runjobs:accounting|Job Accounting]] for details on SU calculations.

**Moral of the story:** Request only the resources needed for your job. Over- or under-requesting resources wastes allocation credits for everyone in your project/workgroup.</note>

<note important>**Interactive jobs:** An interactive job is billed the SUs associated with the full wall time of its execution, not just the CPU time accrued through its duration. For example, if you leave an interactive job running for 2 hours and execute code for 2 minutes, your allocation will be billed for 2 hours of time, not 2 minutes. Please review [[abstract:darwin:runjobs:accounting|job accounting]] to determine the SUs associated with each type of resource requested (compute, gpu) and the SUs billed per hour.</note>
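As a concrete illustration of the notes above, a job that needs, say, 10 cores for at most 2 hours should request exactly that. The sketch below is illustrative only: the job name, partition, and resource values are placeholder assumptions, not a prescribed template.

<code bash>
#!/bin/bash
#
# Hedged sketch: request only what the job needs so SUs are not
# pre-debited for unused cores or wall time.  All values here are
# placeholders for illustration.
#SBATCH --job-name=myjob          # placeholder name
#SBATCH --partition=standard      # placeholder partition
#SBATCH --ntasks=10               # 10 cores, not more "just in case"
#SBATCH --time=02:00:00           # 2 hours, not days
#
# ...run the actual program here...
</code>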
  
===== Interactive jobs (salloc) =====
|Extra-Large Memory/2 TiB   |%%--%%partition=xlarge-mem                   |2031616|1984|
|nVidia-T4/512 GiB          |%%--%%partition=gpu-t4                       | 499712| 488|
|nVidia-V100/768 GiB        |%%--%%partition=gpu-v100                     | 737280| 720|
|amd-MI50/512 GiB           |%%--%%partition=gpu-mi50                     | 499712| 488|
|Extended Memory/3.73 TiB   |%%--%%partition=extended-mem %%--exclusive%% | 999424| 976|
</note>
  
For [[abstract:darwin:runjobs:schedule_jobs#gpu-nodes|GPU nodes]], you must also specify one of the GPU resource option flags, otherwise your job will not be permitted to run in the GPU partitions.
  
==== Exclusive access ====
==== GPU nodes ====
  
Jobs that will run in one of the GPU partitions must request GPU resources using ONE of the following flags:
  
^Flag^Description^
|''%%--%%gpus-per-task=<count>''|<count> GPUs are required for each task in the job|
  
If you do not specify one of these flags, your job will not be permitted to run in the GPU partitions.
  
<note important>On DARWIN the ''%%--%%gres'' flag should NOT be used to request GPU resources.  The GPU type will be inferred from the partition to which the job is submitted if not specified.</note>
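For example, a minimal batch script header requesting a single GPU in one of the GPU partitions might look like the following sketch. The job name and time limit are placeholder assumptions; the partition and ''%%--%%gpus'' flag come from the tables above.

<code bash>
#!/bin/bash
#
# Hedged sketch: request one GPU using ONE of the flags from the table
# above (--gpus here); job name and time limit are placeholders.
#SBATCH --job-name=gputest
#SBATCH --partition=gpu-t4
#SBATCH --gpus=1
#SBATCH --time=01:00:00
</code>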
  
The name provided with the ''%%--%%job-name'' command-line option will be assigned to the interactive session/job that the user started versus the default name ''interact''. See [[abstract/darwin/runjobs/job_status|Managing Jobs]] on the <html><span style="color:#ffffff;background-color:#2fa4e7;padding:3px 7px !important;border-radius:4px;">sidebar</span></html> for general information about commands in Slurm to manage all your jobs on DARWIN.
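For instance, to start an interactive session whose queue entry carries a more descriptive name than the default ''interact'', one might run the following. The job name and partition are placeholder assumptions.

<code bash>
# Hedged sketch: name an interactive job at submission time.
# "mytest" and the partition choice are placeholders; other salloc
# options (cores, memory, time) follow as usual.
salloc --job-name=mytest --partition=standard
</code>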

===== Launching GUI Applications (VNC for X11 Applications) =====

Please review [[ technical:recipes/vnc-usage | using VNC for X11 Applications]] as an alternative to X11 Forwarding.

  
===== Launching GUI Applications (X11 Forwarding) =====
</code>
  
This will launch an interactive job on one of the compute nodes in the ''standard'' partition, in this case ''r1n02'', with default options of one CPU (core), 1 GB of memory, and 30 minutes of time.
  
Now the compute node and environment will be ready to launch any program that has a GUI (Graphical User Interface) and have it displayed on your local computer.
  
<note important>
==== Handling System Signals aka Checkpointing ====
  
Generally, there are two possible cases when jobs are killed: (1) preemption and (2) the walltime configured within the job script has elapsed. Checkpointing can be used to intercept and handle the system signals in each of these cases to write out a restart file, perform cleanup or backup operations, or any other tasks before the job gets killed. Of course, this depends on whether or not the application or software you are using is checkpoint enabled.
  
<note important>Please review the comments provided in the Slurm job script templates available in ''/opt/shared/templates'' that demonstrate the ways to trap these signals.</note>
"TERM" is the most common system signal triggered in both of the above cases. However, the preemption of a job follows the logic described below.
  
When a job gets submitted to a workgroup-specific partition and resources are tied up by jobs in the ''idle'' partition, the jobs in the ''idle'' partition will be preempted to make way.  Slurm sends a preemption signal to the job (SIGCONT followed by SIGTERM), then waits for a grace period (5 minutes) before signaling again (SIGCONT followed by SIGTERM) and finally killing it (SIGKILL).  However, if the job can simply be re-run as-is, the user can submit with ''%%--%%requeue'' to indicate that an ''idle'' job that was preempted should be rerun on the ''idle'' partition (possibly restarting immediately on different nodes; otherwise it will need to wait for resources to become available).
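For instance, an ''idle''-partition job that can safely be rerun from the start after preemption could be submitted like this. The script name is a placeholder assumption.

<code bash>
# Hedged sketch: mark an idle-partition job as requeueable so Slurm
# reruns it after preemption instead of leaving it killed.
# "myjob.qs" is a placeholder script name.
sbatch --requeue --partition=idle myjob.qs
</code>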
  
For example, using the logic provided in one of the Slurm job script templates, one can catch these signals during preemption and handle them by performing cleanup or backing up job results, as follows.
  
<code bash>
#SBATCH --partition=idle
#SBATCH --job-name="atest"
#SBATCH --nodes=1
|''SLURM_TASKS_PER_NODE''|Number of tasks to be initiated on each node|
  
The mechanism by which you can spread your job across nodes is a bit more complex.  If your MPI job wants N CPUs and you're willing to have as few as M of them running per node, then the maximum node count is µ=⌈N/M⌉.
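The ceiling µ=⌈N/M⌉ can be computed with plain integer arithmetic; here is a small shell sketch, where the values of N and M are just examples:

<code bash>
# Hedged sketch: ceiling division without floating point.
# N = total CPUs wanted, M = minimum CPUs per node (example values).
N=40
M=16
MAX_NODES=$(( (N + M - 1) / M ))
echo "$MAX_NODES"    # 3, since 40/16 = 2.5 rounds up to 3
</code>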
  
<code>