====== Enhanced QLogin ======

Applications like Matlab or Mathematica, with graphical user interfaces, present additional challenges in cluster deployment scenarios. Since remote connections are usually not made directly to the compute nodes but NATed through the cluster head node or a router, passing the side-band X11 traffic is challenging. Luckily, SSH provides exactly this functionality: if a user ''ssh'''es to a cluster head node from his desktop, a second ''ssh'' from there to a compute node will successfully tunnel X11 traffic from the compute node back to the head node, and from there back to the user's desktop.

The problem comes when you introduce Grid Engine for interactive job scheduling via the ''qlogin'' command. On the Mills cluster here at UD, from day one the behavior of ''qlogin'' was modified to use a script we provide rather than the default; from ''qconf -sconf'':

<code>
qlogin_command               /opt/shared/GridEngine/local/qlogin_ssh
qlogin_daemon                /usr/sbin/sshd -i
rlogin_command               builtin
rlogin_daemon                builtin
rsh_command                  builtin
rsh_daemon                   builtin
</code>

On the compute node the standard SSH daemon is used to accept the ''qlogin'' connection from the head node. On the head node, the ''qlogin'' connection is made by the ''/opt/shared/GridEngine/local/qlogin_ssh'' script:

<code bash>
#!/bin/sh

HOST=$1
PORT=$2

#
# Ensure that the environment on the remote host will
# match the working env here:
#
export SGE_QLOGIN_PWD=`/bin/pwd`
if [ -z "$WORKGROUP" ]; then
  export WORKGROUP=`id -g -n`
fi
if [ -z "$WORKDIR" ]; then
  WORKDIR=`/opt/shared/workgroup/bin/workdir -g $WORKGROUP`
  if [ $? = 0 ]; then
    export WORKDIR
  fi
fi

if [ "x$DISPLAY" = "x" ]; then
  exec /usr/bin/ssh -p $PORT $HOST
else
  exec /usr/bin/ssh -X -Y -p $PORT $HOST
fi
</code>

Grid Engine passes two arguments to the script: the hostname of the compute node and the TCP/IP port to use for the session. The ''SGE_QLOGIN_PWD'' variable is set to the directory that was current when the ''qlogin'' command was issued -- we'll see why in a moment.
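The environment capture at the top of the script can be exercised on its own, outside of Grid Engine. A minimal sketch follows; the Mills-specific ''workdir'' lookup is omitted here, so only ''SGE_QLOGIN_PWD'' and the ''WORKGROUP'' fallback are shown:

<code bash>
#!/bin/sh
#
# Capture the same session state that qlogin_ssh exports before
# invoking ssh: the current directory and, if WORKGROUP is not
# already set, the current primary Unix group.
#
export SGE_QLOGIN_PWD=`/bin/pwd`
if [ -z "$WORKGROUP" ]; then
  export WORKGROUP=`id -g -n`
fi

echo "SGE_QLOGIN_PWD=$SGE_QLOGIN_PWD"
echo "WORKGROUP=$WORKGROUP"
</code>

Running this from any directory prints the values that would be forwarded to the compute node for that session.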
If ''DISPLAY'' is set, we assume there is an X11 session active for the user and we tunnel it to the compute node; otherwise, a standard SSH session is opened. Note that we use ''exec'' because there is nothing else for this script to do after opening the connection.

===== Acting More Like QRsh =====

The standard behavior of this ''qlogin'' is thus to open a connection to a compute node and (as usual for ''ssh'') leave the user in his home directory, running under his default Unix group (on Mills, ''everyone''). Unfortunately, on Mills the idea is to have people transition into secondary groups in order to submit jobs to Grid Engine. The default ''qrsh'' under Grid Engine propagates the current Unix group to the compute node (if the sysadmin wants it to), so the user's compute node environment is more similar to the one he had on the head node. The nature of the work environment being promoted on Mills dictates that ''qlogin'' sessions would work best if the remote environment:

  * uses the same Unix group that was active on the head node
  * uses the same working directory that was active on the head node

To accomplish this, we first need to modify which environment variables ''ssh'' will pass to the compute node. On the head node, the following was added to ''/etc/ssh/ssh_config'':

<code>
SendEnv XMODIFIERS
SendEnv WORKGROUP WORKDIR SGE_QLOGIN_PWD
</code>

Likewise, the SSH daemon on the compute nodes must be configured (in ''/etc/ssh/sshd_config'') to //accept// those variables into the remote environment:

<code>
AcceptEnv XMODIFIERS
AcceptEnv WORKGROUP WORKDIR SGE_QLOGIN_PWD
</code>

With these three variables being passed to the compute node, a ''/etc/profile.d'' script can detect their presence and react accordingly: changing the working directory to ''SGE_QLOGIN_PWD'' and possibly changing the group associated with the process using the value of ''WORKGROUP'' and the ''workgroup'' command available on Mills (similar to ''newgrp'').
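Such a ''/etc/profile.d'' script might look like the following. This is a sketch, not the production script on Mills; the script's filename and the ''workgroup'' invocation (including its ''-g'' option) are assumptions about that Mills-specific command:

<code bash>
#!/bin/sh
#
# /etc/profile.d/qlogin-env.sh (hypothetical name) -- runs at login
# on the compute node and reacts to the variables forwarded by
# qlogin_ssh over SSH.
#

# Return to the directory that was current when qlogin was issued:
if [ -n "$SGE_QLOGIN_PWD" -a -d "$SGE_QLOGIN_PWD" ]; then
  cd "$SGE_QLOGIN_PWD"
fi

# Re-enter the workgroup that was active on the head node; skip if
# we are already running under that group:
if [ -n "$WORKGROUP" -a "`id -g -n`" != "$WORKGROUP" ]; then
  if [ -x /opt/shared/workgroup/bin/workgroup ]; then
    /opt/shared/workgroup/bin/workgroup -g "$WORKGROUP"
  fi
fi
</code>

Because the variables are only set for ''qlogin'' sessions that passed through ''qlogin_ssh'', ordinary logins are unaffected: both tests fail and the script does nothing.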