===== Array Jobs =====
  
An array job runs the same job script many times as separate sub-tasks. For each sub-task, the environment variable **SLURM_ARRAY_TASK_ID** is set to a unique index whose value can provide input to the job submission script.
For example, the sub-task indices can be given as an explicit comma-separated list:

<WRAP>
%%--%%array=1,2,5,19,27
</WRAP>
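
As a minimal sketch (the script and program names below are hypothetical), a job script can read **SLURM_ARRAY_TASK_ID** to select a per-task input file:

<code bash>
#!/bin/bash -l
#
# myjob.qs -- hypothetical array job script.
#SBATCH --job-name=array-demo
#SBATCH --ntasks=1

# Slurm sets SLURM_ARRAY_TASK_ID to this sub-task's index
# (1, 2, 5, 19, or 27 when submitted with the list above).
echo "running sub-task ${SLURM_ARRAY_TASK_ID}"
./myprogram input.${SLURM_ARRAY_TASK_ID}.dat
</code>

Submitting it with ''sbatch %%--%%array=1,2,5,19,27 myjob.qs'' queues five sub-tasks, one per index.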
<note important>The default job array size limits are set to 10000 for Slurm on Caviness to avoid oversubscribing the scheduler node's own resource limits (causing scheduling to become sluggish or even unresponsive). See the [[technical:slurm:caviness:arraysize-and-nodecounts#job-array-size-limits|technical explanation]] for why this is necessary.
</note>
  
For more details, see [[abstract:caviness:runjobs:schedule_jobs#array-jobs1|Array Jobs]].
===== Chaining Jobs =====
  
If you have multiple jobs and want one or more of them to run automatically after another finishes, you can use chaining. When you chain jobs, remember to check the status of the preceding job to determine whether it completed successfully; this prevents the system from flooding the scheduler with failed jobs. Here is a simple chaining example with three job scripts ''doThing1.qs'', ''doThing2.qs'' and ''doThing3.qs''.
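
A minimal sketch of one way to submit such a chain, using Slurm's ''%%--%%dependency=afterok'' option so that each job starts only if its predecessor exited successfully:

<code bash>
# Submit the first job; --parsable makes sbatch print just the job id.
JOB1=$(sbatch --parsable doThing1.qs)

# doThing2.qs starts only if doThing1 completes with exit code 0.
JOB2=$(sbatch --parsable --dependency=afterok:${JOB1} doThing2.qs)

# doThing3.qs likewise waits on doThing2's successful completion.
sbatch --dependency=afterok:${JOB2} doThing3.qs
</code>

With ''afterok'', a dependent job whose predecessor fails typically remains pending with reason ''DependencyNeverSatisfied'' rather than running against bad input.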
Four sub-tasks are executed, numbered from 1 through 4. The starting index must be greater than zero, and the ending index must be greater than or equal to the starting index. The //step size// going from one index to the next defaults to one, but can be any positive integer. A step size is appended to the sub-task range as in ''2-20:2'': proceed from 2 up to 20 in steps of 2, i.e. 2, 4, 6, 8, 10, and so on.
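
For example (a hypothetical sketch), a stepped range can also be requested as a directive inside the script itself:

<code bash>
#!/bin/bash -l
#
# stepped.qs -- hypothetical script using a stepped sub-task range.
#SBATCH --job-name=stepped-demo
#SBATCH --array=2-20:2

# Ten sub-tasks run, with indices 2, 4, 6, ..., 20.
echo "index ${SLURM_ARRAY_TASK_ID}"
</code>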
  
<note important>The default job array size limits are set to 10000 for Slurm on Caviness to avoid oversubscribing the scheduler node's own resource limits (causing scheduling to become sluggish or even unresponsive). See the [[technical:slurm:caviness:arraysize-and-nodecounts#job-array-size-limits|technical explanation]] for why this is necessary.
</note>
==== Partitioning Job Data ====