====== Alterations to the workgroup Command ======
Users of the IT-RCI clusters have a primary gid (group id) of //everyone// (or //900//), but submitting and executing jobs requires an effective gid other than this default. The ''workgroup'' command executes a command or spawns a new shell whose effective gid is that of a secondary group of which the user is a member. For example, a member of the ''it_nss'' workgroup on Caviness can do the following:
[user@login00 ~]$ id -gn
everyone
[user@login00 ~]$ workgroup -g it_nss
[(it_nss:user)@login00 ~]$ id -gn
it_nss
At login the effective gid was //everyone//. After issuing the ''workgroup'' command, the new shell has an effective gid of //it_nss//, which is also reflected in the shell prompt.
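Since ''workgroup'' only works for secondary groups the user belongs to, it can help to check membership first. The following is a minimal sketch that relies only on ''id -Gn''; the helper name ''in_group'' is hypothetical and is not part of the ''workgroup'' tool. It assumes group names contain no whitespace.

```shell
#!/usr/bin/env bash
# in_group is a hypothetical helper, not part of the workgroup tool.
# It returns success if the named group is among the current user's groups.
in_group() {
    local wanted="$1" g
    # id -Gn prints all group names of the current user, space-separated.
    for g in $(id -Gn); do
        if [ "$g" = "$wanted" ]; then
            return 0
        fi
    done
    return 1
}

# Example: check membership before requesting a workgroup shell.
if in_group it_nss; then
    echo "it_nss is one of your groups"
else
    echo "it_nss is not one of your groups"
fi
```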
===== Issues =====
One issue with the ''workgroup'' command lies in how the environment is inherited. A Bash shell's state includes unexported variables, aliases, array-valued variables, and functions, none of which are part of the POSIX environment that sub-processes inherit:
[user@login00 ~]$ bash_function() { echo "OK"; }
[user@login00 ~]$ declare -a var_array
[user@login00 ~]$ var_array+=(1 2 3)
[user@login00 ~]$ bash_function
OK
[user@login00 ~]$ echo ${var_array[@]}
1 2 3
[user@login00 ~]$ workgroup -g it_nss
[(it_nss:user)@login00 ~]$ bash_function
bash: bash_function: command not found...
[(it_nss:user)@login00 ~]$ echo ${var_array[@]}
[(it_nss:user)@login00 ~]$
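The same effect can be reproduced without ''workgroup'' at all, since it is a property of how any child process is started. The sketch below defines one entity of each kind in a parent Bash script and asks a plain child ''bash'' what it can see. All of the ''DEMO_*'' and ''demo_*'' names are hypothetical and exist only for the demonstration.

```shell
#!/usr/bin/env bash
# All names below are hypothetical and exist only for this demonstration.
export DEMO_EXPORTED="visible"     # exported: part of the POSIX environment
DEMO_UNEXPORTED="hidden"           # plain shell variable, never exported
demo_array=(1 2 3)                 # Bash arrays cannot be exported
demo_function() { echo "OK"; }     # function, not exported
alias demo_alias='echo OK'         # aliases live only in the defining shell

# Start a fresh child bash and record which of the entities it can see.
child_report=$(bash -c '
    echo "exported=${DEMO_EXPORTED:-<unset>}"
    echo "unexported=${DEMO_UNEXPORTED:-<unset>}"
    echo "array=${demo_array[*]:-<unset>}"
    declare -F demo_function >/dev/null && echo "function=present" || echo "function=absent"
    alias demo_alias >/dev/null 2>&1 && echo "alias=present" || echo "alias=absent"
')
echo "$child_report"
```

Only the exported variable should reach the child; the other four entities stay behind in the parent shell.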
A tool like VALET that makes alterations to the current shell's environment may edit standard variables like ''PATH'' -- which will carry over into the workgroup shell -- but may also introduce Bash-specific entities that do not. This leaves the workgroup shell in an odd hybrid state:
[user@login00 ~]$ vpkg_require r/default
Adding dependency `gcc/4.9.4` to your environment
Adding dependency `atlas/3.10.3` to your environment
Adding package `r/3.5.1` to your environment
[user@login00 ~]$ R-search
Library Valet Package R Versn
-------------------- ----------------------- -------
:
[user@login00 ~]$ workgroup -g it_nss
[(it_nss:user)@login00 ~]$ R-search
bash: R-search: command not found...
[(it_nss:user)@login00 ~]$ vpkg_require r/default
[(it_nss:user)@login00 ~]$ R-search
bash: R-search: command not found...
[(it_nss:user)@login00 ~]$ vpkg_rollback
ERROR: no environment snapshots for the current shell, unable to roll back
The ''R-search'' command is an alias, so it does not make it to the workgroup shell. The environment variables do, though, so VALET sees that ''r/default'' has already been loaded and will not load it again. Since the new shell has a different VALET identity, no snapshots are present to allow ''vpkg_rollback'' to remove the changes.
In the past, users were cautioned to use the ''workgroup'' command prior to introducing any packages into the runtime environment:
[(it_nss:user)@login00 ~]$ exit
[user@login00 ~]$ vpkg_rollback all
[user@login00 ~]$ /opt/shared/workgroup/bin/workgroup -g it_nss
[(it_nss:user)@login00 ~]$ vpkg_require r/default
Adding dependency `gcc/4.9.4` to your environment
Adding dependency `atlas/3.10.3` to your environment
Adding package `r/3.5.1` to your environment
[(it_nss:user)@login00 ~]$ R-search
Library Valet Package R Versn
-------------------- ----------------------- -------
:
It would be far more useful for the ''workgroup'' command to start the new shell with a pristine environment, devoid of all modifications and augmented only by the standard login scripts (e.g. ''~/.bash_profile'').
===== Solution =====
To get around the standard behavior of a subprocess inheriting the POSIX environment variables of its parent, the ''env'' command can be used:
NAME
env - run a program in a modified environment
SYNOPSIS
env [OPTION]... [-] [NAME=VALUE]... [COMMAND [ARG]...]
DESCRIPTION
Set each NAME to VALUE in the environment and run COMMAND.
Mandatory arguments to long options are mandatory for short options too.
-i, --ignore-environment
start with an empty environment
Rather than having ''workgroup'' execute the ''newgrp'' command directly, it can be executed indirectly by the ''env'' command:
/usr/bin/env -i /usr/bin/newgrp - it_nss
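The effect of ''env -i'' can be observed without changing groups. In the sketch below a plain ''bash'' child stands in for ''newgrp'', and ''DEMO_PKG_MARKER'' is a hypothetical variable standing in for the changes a tool like VALET makes to the environment.

```shell
#!/usr/bin/env bash
# DEMO_PKG_MARKER is hypothetical; it stands in for package-related variables.
export DEMO_PKG_MARKER="r/3.5.1"

# Resolve bash to a full path up front, since env -i also clears PATH.
bash_path=$(command -v bash)

# Ordinary child: inherits the parent's exported variables.
inherited=$("$bash_path" -c 'echo "${DEMO_PKG_MARKER:-<unset>}"')

# Child started via env -i: begins with an empty environment.
clean=$(env -i "$bash_path" -c 'echo "${DEMO_PKG_MARKER:-<unset>}"')

echo "inherited child sees: $inherited"
echo "env -i child sees:    $clean"
```

The ordinary child still sees the marker, while the child started through ''env -i'' does not. This is the behavior the indirect ''newgrp'' invocation relies on.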
A modified version of the ''workgroup'' command was produced, and minor alterations were made to the Caviness cluster's login scripts to ensure the workgroup prompt is still set as expected. To test:
[user@login00 ~]$ vpkg_require r/default
Adding dependency `gcc/4.9.4` to your environment
Adding dependency `atlas/3.10.3` to your environment
Adding package `r/3.5.1` to your environment
[user@login00 ~]$ workgroup -g it_nss
[(it_nss:user)@login00 ~]$ which R
/usr/bin/which: no R in (/opt/shared/workgroup/20200723/bin:/home/1001/bin:/opt/shared/valet/2.1/bin/bash:/opt/shared/valet/2.1/bin:/opt/shared/slurm/add-ons/bin:/opt/shared/slurm/bin:/usr/lib64/qt-3.3/bin:/opt/shared/gqueue/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin)
[(it_nss:user)@login00 ~]$ vpkg_history
[(it_nss:user)@login00 ~]$ vpkg_require r/default
Adding dependency `gcc/4.9.4` to your environment
Adding dependency `atlas/3.10.3` to your environment
Adding package `r/3.5.1` to your environment
[(it_nss:user)@login00 ~]$ which R
/opt/shared/r/3.5.1/bin/R
The workgroup shell starts with a clean environment: the ''vpkg_require r/default'' from the original shell did not carry over into the workgroup shell. The workgroup shell's prompt is working as before. This is the desired behavior.
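Because the new shell no longer inherits anything from its parent, the prompt has to be derived inside the login scripts themselves. The actual changes made to the Caviness login scripts are not reproduced here; the following is only a sketch of one way such a prompt could be built from the effective group. The function name ''workgroup_prompt_sketch'' and the default-group argument are hypothetical.

```shell
#!/usr/bin/env bash
# workgroup_prompt_sketch is hypothetical; it is NOT the cluster's login script.
# It prints a PS1 string: plain for the default group, annotated otherwise.
workgroup_prompt_sketch() {
    local group="$1" default_group="${2:-everyone}"
    if [ "$group" = "$default_group" ]; then
        # \u, \h and \W are expanded by Bash when the prompt is displayed.
        printf '%s' '[\u@\h \W]\$ '
    else
        # Doubled backslashes yield single literal backslashes in the output.
        printf '[(%s:\\u)@\\h \\W]\\$ ' "$group"
    fi
}

# In a login script this might be used as (assumption, shown commented out):
# PS1=$(workgroup_prompt_sketch "$(id -gn)")
workgroup_prompt_sketch it_nss; echo
```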
Note that this is **not** the desired behavior when ''workgroup'' is used to execute commands (using the ''-C'' flag). In that mode, all environment variables from the parent shell are still passed to the command when it is executed.
===== Implementation =====
Updates to the code have been pushed to the [[https://gitlab.com/jtfrey/workgroup|official repository]]. The updated version of the ''workgroup'' command is available under ''/opt/shared/workgroup/20200723/bin'' on Caviness and Farber for testing.
===== Timeline =====
^Date ^Time ^Goal/Description ^
|2020-07-23| |Authoring of this document|
|2020-07-30|09:00|Activation on Caviness and Farber|