This document summarizes alterations to the Slurm job scheduler configuration on the DARWIN cluster.
When the salloc command is invoked without a command to run (a script and its arguments), the value configured in the InteractiveStepOptions key (in /etc/slurm/slurm.conf) provides the default command to execute on the allocated resources. For example:
[(workgroup:user)@login01.darwin ~]$ salloc --partition=idle
salloc: Granted job allocation 13065047
salloc: Waiting for resource configuration
salloc: Nodes r1n00 are ready for job
[(workgroup:user)@r1n00 ~]$
The default command as defined in the Slurm configuration mirrors the suggested default from the Slurm developers:
InteractiveStepOptions="-n1 -N1 --mpi=none --interactive --preserve-env --pty $SHELL"
Thus, the salloc illustrated above is equivalent to:
[(workgroup:user)@login01.darwin ~]$ salloc --partition=idle srun -n1 -N1 --mpi=none --interactive --preserve-env --pty $SHELL
salloc: Granted job allocation 13065047
salloc: Waiting for resource configuration
salloc: Nodes r1n00 are ready for job
[(workgroup:user)@r1n00 ~]$
There are several issues with this default command:
1. -n1 -N1 limits the remote shell to accessing a single task of the allocation
2. The shell ($SHELL, coming from the user's current environment) is not executed as a login shell
3. srun by default propagates the user's current environment variables to the remote node(s); we generally do not recommend this behavior on DARWIN
On the first point, the Slurm allocation may have been for -N1 -n4 -c8 (one node, four tasks, eight CPUs per task), but the remote shell will only have access to one task (with eight CPUs). The user more likely expected the remote shell to have access to the full set of resources allocated on the primary node assigned to the job, akin to the batch step in submitted job scripts.
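The arithmetic behind this example can be sketched in shell. The helper name step_cpus is hypothetical and does nothing more than multiply a task count by a per-task CPU count; it does not query Slurm.

```shell
# Hypothetical helper (not part of Slurm): CPUs usable by a step that
# spans `ntasks` tasks of `cpus_per_task` CPUs each on a single node.
step_cpus() {
    ntasks="$1"
    cpus_per_task="$2"
    echo $(( ntasks * cpus_per_task ))
}

# Allocation requested with -N1 -n4 -c8:
step_cpus 1 8   # shell launched with -n1: one task's CPUs only
step_cpus 4 8   # all four tasks on the primary node
```

The first call corresponds to the shell started by the default command, the second to what the user most likely expected.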
The second and third points may prevent some runtime environment setup from happening; this can be problematic when exported environment variables are reconstituted in the runtime environment by Slurm, but unexported variables, aliases, and functions are not restored. Our best practice for job scripts is to send no environment variables from the submission environment to the runtime environment; ideally, the same should be observed for interactive sessions.
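The difference between exported and unexported state can be illustrated locally, without Slurm. In this minimal sketch a child sh process stands in for the remote shell; that substitution is an assumption made for illustration, and the names demo_env_inheritance, plain_var, EXPORTED_VAR, and helper_fn are hypothetical.

```shell
# Illustration only: a local child shell stands in for the remote shell.
demo_env_inheritance() {
    (
        plain_var="not exported"          # shell variable, never exported
        export EXPORTED_VAR="exported"    # environment variable
        helper_fn() { echo "fn"; }        # shell function, never exported

        # The child receives only the exported environment.
        sh -c '
            echo "EXPORTED_VAR=${EXPORTED_VAR:-<unset>}"
            echo "plain_var=${plain_var:-<unset>}"
            if command -v helper_fn >/dev/null 2>&1; then
                echo "helper_fn=defined"
            else
                echo "helper_fn=<undefined>"
            fi
        '
    )
}

demo_env_inheritance
```

Only EXPORTED_VAR survives into the child; the unexported variable and the function do not, which is the partial state the paragraph above warns about.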
To address the issue of not all resources on the primary node being made available to the remote shell, the node and task counts will be dropped from the InteractiveStepOptions.
The majority of command shells recognize the -l flag as requesting login shell behavior. Appending a -l flag to the InteractiveStepOptions should be sufficient.
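For users whose shell is bash, the effect of -l can be checked with the read-only login_shell shell option. The sketch below assumes bash is installed; the function name is_login_shell is hypothetical, and --noprofile is used only to keep profile scripts from printing into the demonstration.

```shell
# Assumes bash; login_shell is a read-only bash shopt.
is_login_shell() {
    # "$@" carries extra flags for the child bash, e.g. -l
    bash --noprofile "$@" -c 'shopt -q login_shell && echo login || echo non-login'
}

is_login_shell      # prints: non-login
is_login_shell -l   # prints: login
```

Inside an interactive session, running shopt login_shell directly gives the same information for the current shell.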
Finally, with regard to environment variable propagation, adding --export=NONE to the InteractiveStepOptions would implement the best practice we seek to promote, but that behavior cannot be overridden with command-line flags to salloc or via the environment (with SLURM_EXPORT_ENV). The only possible override is for a user to bypass the InteractiveStepOptions by providing an explicit command, e.g. an srun lacking the --export flag that appears in InteractiveStepOptions. The desired best practice must be assumed to be the dominant use case (and will correlate with official documentation, for example), so adding --export=NONE to the InteractiveStepOptions is the correct choice.
This yields an altered InteractiveStepOptions of:
InteractiveStepOptions="--mpi=none --interactive --preserve-env --pty --export=NONE $SHELL -l"
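The choice between the configured default and a user-supplied command can be modeled as follows. This is a sketch of the behavior described in this document, not of salloc's actual implementation; the variable INTERACTIVE_STEP_OPTIONS and the function salloc_step_command are hypothetical names.

```shell
# Value from the altered configuration; single quotes keep $SHELL literal.
INTERACTIVE_STEP_OPTIONS='--mpi=none --interactive --preserve-env --pty --export=NONE $SHELL -l'

# Hypothetical model: print the command that would run in the allocation.
salloc_step_command() {
    if [ "$#" -gt 0 ]; then
        # An explicit command bypasses InteractiveStepOptions entirely.
        printf '%s\n' "$*"
    else
        # No command given: srun is launched with the configured options.
        printf 'srun %s\n' "$INTERACTIVE_STEP_OPTIONS"
    fi
}

salloc_step_command                      # configured default
salloc_step_command srun --pty bash -l   # explicit override without --export
```

The second call shows the only override path: supplying an explicit srun that omits --export.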
No downtime is necessary since this change affects the behavior of the salloc command (not any of the Slurm daemons). The new configuration will be pushed to all nodes and take effect immediately.
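One way to confirm that a node has received the new configuration is to look for the key in its copy of slurm.conf. The helper below is a minimal sketch; the function name show_interactive_step_options is hypothetical, and the default path is the one cited earlier in this document.

```shell
# Hypothetical helper: print the InteractiveStepOptions line from a Slurm
# configuration file (default path as cited in this document).
show_interactive_step_options() {
    conf="${1:-/etc/slurm/slurm.conf}"
    # -i because the key may be written in a different letter case
    grep -i '^InteractiveStepOptions=' "$conf"
}

# Usage on a cluster node:
#   show_interactive_step_options
#   show_interactive_step_options /path/to/another/slurm.conf
```

On a running cluster, scontrol show config also lists the active configuration values and could serve the same purpose.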
Date | Time | Goal/Description |
---|---|---|
2022-01-12 | | Authoring of this document |
2022-01-19 | 09:00 | Implementation |