technical:slurm:darwin:templates:start

Differences

This shows you the differences between two versions of the page.

technical:slurm:darwin:templates:start [2019-06-25 16:57] – external edit 127.0.0.1
technical:slurm:darwin:templates:start [2023-11-27 16:50] (current) – old revision restored (2021-04-27 16:21) frey
Line 1:
-====== Caviness Slurm Job Script Templates ======
+====== DARWIN Slurm Job Script Templates ======
  
-On both Mills and Farber, example job scripts were made available under the ''/opt/templates/gridengine'' directory.  Each template contained extensive Bash code to sense and set up the job environment.  Since the script templates were thus self-contained, they had no external dependencies.
-
-One major problem with this paradigm comes when changes must be made to templates.  Since users make copies of the template, they are responsible for merging any ongoing change into their copies.  In practice, this simply does not happen:  a working job script is reused over and over again without modification.
-
-On Caviness, the job script templates have been structured differently.  Most of the environment sensing and setup code has been shifted out of the job script templates and into external script fragments that are sourced (executed) by the job script.  What remains in the job script templates is the setting of variables that influence those external fragments' execution and the sourcing of them.  Now, when IT must change the behavior of job scripts, the external fragments are modified and the change takes effect for all job scripts deriving from the templates.
+As on Caviness, environment sensing and setup code has been shifted out of the job script templates and into external script fragments that are sourced (executed) by the job script.  What remains in the job script templates is the setting of variables that influence those external fragments' execution and the sourcing of them.  When IT-RCI must change the behavior of job scripts, the external fragments are modified and the change takes effect for all job scripts deriving from the templates.
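
The set-variables-then-source pattern described above can be sketched as follows. This is a minimal illustration, not one of the actual templates: the fragment name ''example-setup.sh'' and the variables ''EXAMPLE_VERBOSE'' and ''EXAMPLE_THREADS'' are hypothetical, and a temporary directory stands in for the real ''libexec'' directory. Only ''SLURM_CPUS_PER_TASK'' is a real Slurm variable.

```shell
#!/bin/bash
# Sketch only: fragment name and EXAMPLE_* variables are hypothetical,
# not the actual files or settings used by the production templates.

# Stand-in for the centrally maintained fragment directory.
LIBEXEC_DIR="$(mktemp -d)"
cat > "${LIBEXEC_DIR}/example-setup.sh" <<'EOF'
# Fragment: sense the environment and honor variables set by the job script.
if [ "${EXAMPLE_VERBOSE:-NO}" = "YES" ]; then
    echo "fragment: verbose setup enabled"
fi
# Fall back to 1 when not running inside a Slurm allocation.
export EXAMPLE_THREADS="${SLURM_CPUS_PER_TASK:-1}"
EOF

# What remains in the job script: set variables, then source the fragment.
EXAMPLE_VERBOSE=YES
. "${LIBEXEC_DIR}/example-setup.sh"

echo "threads: ${EXAMPLE_THREADS}"

# Remove the temporary stand-in directory.
rm -rf "${LIBEXEC_DIR}"
```

Because the fragment is read each time the job script runs, editing the fragment changes the behavior of every job script that sources it, without users having to touch their own copies.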
  
 ===== Where Can I Find Them? =====
  
-IT staff maintain the Caviness templates via git.  The production copy of the repository is checked out in ''/opt/shared/slurm/templates'', with a symbolic link at ''/opt/templates/slurm'' to maintain parity with the Mills and Farber systems.
+IT-RCI staff maintain the DARWIN templates via git.  The production copy of the repository is checked out in ''/opt/shared/slurm/templates'', with a symbolic link at ''/opt/shared/templates/slurm'' to maintain parity with other HPC systems.
  
 The external script fragments (mentioned above and discussed in detail below) can be found in the ''/opt/shared/slurm/templates/libexec'' directory.
  
-The ''/opt/templates/slurm'' symbolic link points to the ''/opt/shared/slurm/templates/job-scripts'' directory, which is organized into distinct job classes.  The top level is split into ''applications'' and ''generic''.
+The ''/opt/shared/templates/slurm'' symbolic link points to the ''/opt/shared/slurm/templates/job-scripts'' directory, which is organized into distinct job classes.  The top level is split into ''applications'' and ''generic''.
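
The DARWIN layout described above can be explored with ordinary shell commands. The sketch below recreates the documented directories and symbolic link under a temporary root (the ''ROOT'' variable is an assumption made so the example runs anywhere); on the cluster itself the same paths would start at ''/''.

```shell
#!/bin/bash
# Sketch only: mirrors the documented layout under a temporary root.
ROOT="$(mktemp -d)"
mkdir -p "${ROOT}/opt/shared/slurm/templates/job-scripts/applications" \
         "${ROOT}/opt/shared/slurm/templates/job-scripts/generic" \
         "${ROOT}/opt/shared/slurm/templates/libexec" \
         "${ROOT}/opt/shared/templates"

# The convenience link points at the job-scripts directory of the checkout.
ln -s "${ROOT}/opt/shared/slurm/templates/job-scripts" \
      "${ROOT}/opt/shared/templates/slurm"

# Listing through the link shows the two top-level job classes.
TOP_LEVEL="$(ls "${ROOT}/opt/shared/templates/slurm")"
echo "${TOP_LEVEL}"

# Remove the temporary tree.
rm -rf "${ROOT}"
```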
  
 ==== Applications ====
  
-Software packages that have unique runtime requirements will have a single script or a directory of scripts located in the ''applications'' directory.  TensorFlow is a good example:  Caviness uses Linux containers (created by Google and distributed via Docker) to execute TensorFlow scripts on compute nodes.  Gaussian requires extra work to tailor input files and its own expected environment variables to match the resources allocated to the job by Slurm.
+Software packages that have unique runtime requirements will have a single script or a directory of scripts located in the ''applications'' directory.  TensorFlow is a good example:  DARWIN uses Linux containers (created by Google and distributed via Docker) to execute TensorFlow scripts on compute nodes.  Gaussian requires extra work to tailor input files and its own expected environment variables to match the resources allocated to the job by Slurm.
  
 ==== Generic ====
  • technical/slurm/darwin/templates/start.1561496243.txt.gz
  • Last modified: 2019-06-25 16:57
  • by 127.0.0.1