Revisions to Slurm Configuration v1.0.8 on DARWIN

This document summarizes alterations to the Slurm job scheduler configuration on the DARWIN cluster.

When the salloc command is invoked without a command to run, the value of the InteractiveStepOptions key (in /etc/slurm/slurm.conf) determines the default srun command executed on the allocated resources. For example:

[(workgroup:user)@login01.darwin ~]$ salloc --partition=idle
salloc: Granted job allocation 13065047
salloc: Waiting for resource configuration
salloc: Nodes r1n00 are ready for job
[(workgroup:user)@r1n00 ~]$

The default command as defined in the Slurm configuration mirrors the suggested default from the Slurm developers:

InteractiveStepOptions="-n1 -N1 --mpi=none --interactive --preserve-env --pty $SHELL"

Thus, the salloc illustrated above is equivalent to:

[(workgroup:user)@login01.darwin ~]$ salloc --partition=idle srun -n1 -N1 --mpi=none --interactive --preserve-env --pty $SHELL
salloc: Granted job allocation 13065047
salloc: Waiting for resource configuration
salloc: Nodes r1n00 are ready for job
[(workgroup:user)@r1n00 ~]$

There are several issues with this default command:

  1. The inclusion of -n1 -N1 limits the remote shell to accessing a single task of the allocation
  2. The remote shell ($SHELL, coming from the user's current environment) is not executed as a login shell
  3. By default, srun propagates the user's current environment variables to the remote node(s); we generally do not recommend this behavior on DARWIN

On the first point, the Slurm allocation may have been for -N1 -n4 -c8 (one node, four tasks, eight CPUs per task), but the remote shell will only have access to one task (with eight CPUs). The user more likely expected the remote shell to have access to the full set of resources allocated on the primary node assigned to the job, akin to the batch step of a submitted job script.
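
One way to see what an interactive shell was actually given is to print a few of Slurm's job environment variables next to the CPU count visible to the process. The sketch below is illustrative only: show_step_resources is a hypothetical helper name, and it assumes the listed SLURM_* variables are set in the session and that GNU nproc is installed.

```shell
# Hypothetical helper: summarize the resources visible to the current shell.
show_step_resources() {
    # Print each Slurm variable, or "unset" when it is absent from the environment.
    for name in SLURM_JOB_ID SLURM_NTASKS SLURM_CPUS_PER_TASK SLURM_CPUS_ON_NODE; do
        eval "value=\${$name:-unset}"
        printf '%s=%s\n' "$name" "$value"
    done
    # nproc reports the number of CPUs available to the calling process.
    if command -v nproc >/dev/null 2>&1; then
        printf 'nproc=%s\n' "$(nproc)"
    fi
}
```

Running show_step_resources inside the shell started by salloc, once with the original options and once with the revised ones, should make any difference in visible CPUs apparent.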

The second and third points may prevent some runtime environment setup from happening. This is problematic because Slurm reconstitutes exported environment variables in the runtime environment, but unexported variables, aliases, and functions are not restored, so the remote shell can end up with a partial copy of the submission environment. Our best practice for job scripts is to send no environment variables from the submission environment to the runtime environment; ideally, the same should be observed for interactive sessions.
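
The distinction between exported and unexported state can be illustrated locally, without Slurm, by starting a child shell. This is only an analogy for what a remote shell receives: the names env_propagation_demo, demo_helper, DEMO_EXPORTED and DEMO_UNEXPORTED are made up for the example.

```shell
# Hypothetical demo: show which parts of a parent shell's state reach a child shell.
# The function body runs in a subshell so the calling shell is left unchanged.
env_propagation_demo() (
    DEMO_UNEXPORTED="set-in-parent"     # plain shell variable, not exported
    DEMO_EXPORTED="set-in-parent"
    export DEMO_EXPORTED                # exported, so children inherit it
    demo_helper() { echo "helper"; }    # shell function, not passed to the child

    sh -c '
        echo "exported=${DEMO_EXPORTED:-missing}"
        echo "unexported=${DEMO_UNEXPORTED:-missing}"
        if command -v demo_helper >/dev/null 2>&1; then
            echo "function=present"
        else
            echo "function=missing"
        fi
    '
)
```

The child shell sees only the exported variable, which is the kind of half-populated environment that sending nothing at all avoids.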

To make all of the job's resources on the primary node available to the remote shell, the node and task counts (-n1 -N1) will be dropped from the InteractiveStepOptions.

Most command shells recognize the -l flag as requesting login shell behavior. Appending a -l flag after $SHELL in the InteractiveStepOptions should be sufficient.
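
For bash specifically, the effect of -l can be checked with the login_shell shell option. The following is a minimal sketch that assumes bash is on the PATH; check_login_shell is a hypothetical helper name, and other shells expose login status differently.

```shell
# Hypothetical helper: start bash with the given extra flags and report
# whether it considers itself a login shell.
check_login_shell() {
    # --noprofile/--norc skip startup files so the demonstration stays quiet;
    # long options must precede single-character options such as -l.
    bash --noprofile --norc "$@" -c 'shopt -q login_shell && echo login || echo non-login'
}

# Usage:
#   check_login_shell        (no -l flag)
#   check_login_shell -l     (with the -l flag)
```

Inside an interactive session, running shopt -q login_shell directly and inspecting the exit status gives the same information for the current shell.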

Finally, with regard to environment variable propagation, adding --export=NONE to the InteractiveStepOptions would implement the best practice we seek to promote, but that behavior cannot then be overridden with command line flags to salloc or via the environment (with SLURM_EXPORT_ENV). The only possible override is for the user to bypass the InteractiveStepOptions by providing an explicit command, e.g. an srun lacking the --export flag that appears in InteractiveStepOptions. The desired best practice must be assumed to be the dominant use case (and will correlate with official documentation, for example), so adding --export=NONE to the InteractiveStepOptions is the correct choice.
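
As a sketch of that override, the helper below only prints an salloc command line carrying an explicit srun, reusing the flags from the revised option string minus --export=NONE. It does not run anything; explicit_interactive_cmd is a hypothetical name, and the fallback to /bin/sh when SHELL is unset is an assumption made for the example.

```shell
# Hypothetical helper: print (do not execute) an salloc invocation that supplies
# its own srun command, so InteractiveStepOptions is not used.
explicit_interactive_cmd() {
    # "$*" carries the usual salloc options, e.g. --partition=idle
    echo "salloc $* srun --mpi=none --interactive --preserve-env --pty ${SHELL:-/bin/sh} -l"
}

# Usage: explicit_interactive_cmd --partition=idle
```

Because the printed srun has no --export flag, the step would fall back to srun's default behavior of propagating the submission environment.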

This yields an altered InteractiveStepOptions of:

InteractiveStepOptions="--mpi=none --interactive --preserve-env --pty --export=NONE $SHELL -l"
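
Once the configuration has been pushed, the active setting can be checked by reading it back from the configuration file. A minimal sketch, assuming the key is set directly in the file named in this document rather than in an included file; show_interactive_step_options is a hypothetical helper name.

```shell
# Hypothetical helper: print the uncommented InteractiveStepOptions line(s)
# from a Slurm configuration file (default: the path named in this document).
show_interactive_step_options() {
    conf="${1:-/etc/slurm/slurm.conf}"
    # Match case-insensitively and ignore lines commented out with '#'.
    grep -i '^[[:space:]]*InteractiveStepOptions[[:space:]]*=' "$conf"
}

# Usage: show_interactive_step_options
#        show_interactive_step_options /path/to/other/slurm.conf
```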

No downtime is necessary since this change affects the behavior of the salloc command (not any of the Slurm daemons). The new configuration will be pushed to all nodes and take effect immediately.

Date        Time   Goal/Description
2022-01-12         Authoring of this document
2022-01-19  09:00  Implementation
  • Last modified: 2022-01-14 13:22
  • by anita