====== Revisions to Slurm Configuration v2.3.2 on Caviness ======

This document summarizes alterations to the Slurm job scheduler configuration on the Caviness cluster.

===== Issues =====

==== Caviness Expansion 3 ====

Two new racks (r05, r06) have been added to the Caviness cluster. Nodes in the new racks must be integrated into the Slurm configuration for job scheduling. First-time investing workgroups must be added to Slurm accounting, and all workgroups' QOS-based resource limits and fairshare factors must be updated.

===== Implementation =====

  * The Slurm ''nodes.conf'' file will be modified to include the nodes in r05 and r06.
  * The Slurm ''partitions.conf'' file will be modified to:
    * Adjust node assignments for existing workgroups that purchased node(s) in r05 or r06
    * Add new workgroups that purchased node(s) in r05 or r06
  * The Slurm ''topology.conf'' file will be modified to include the OPA switches/HFIs in r05 and r06.
    * The ''/opt/shared/slurm/add-ons/bin/opa2slurm'' utility (written by IT-RCI staff) will be used to automatically map the OPA network.
  * The Slurm accounting database will be updated:
    * Add new workgroups and populate each with its members
    * For each workgroup, update the calculated fairshare fraction (the workgroup's investment as a percentage of total dollars invested)
    * For each workgroup, update the workgroup-partition maximum CPU/memory/GPU limits

===== Impact =====

No downtime is expected. The configuration version will be bumped to v2.4.0.

===== Timeline =====

^Date ^Time ^Goal/Description ^
|2023-05-27| |Authoring of this document|
|2023-05-30|09:00|Implementation|
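
The fairshare update listed under Implementation can be illustrated with a minimal sketch. It assumes the fairshare fraction is simply each workgroup's investment divided by the total investment across all workgroups, which is one reading of "dollar percentage of workgroup investment". The workgroup names and dollar amounts below are hypothetical, and loading the resulting values into the Slurm accounting database is not shown.

```python
def fairshare_fractions(investments):
    """Return each workgroup's share of the total investment.

    `investments` maps a workgroup name to its dollar investment.
    Assumption: fairshare fraction = workgroup dollars / total dollars.
    """
    total = sum(investments.values())
    if total <= 0:
        raise ValueError("total investment must be positive")
    return {wg: dollars / total for wg, dollars in investments.items()}


# Hypothetical workgroups and amounts, for illustration only.
example = {
    "wg_a": 250_000,
    "wg_b": 500_000,
    "wg_c": 250_000,
}

shares = fairshare_fractions(example)
for workgroup, share in sorted(shares.items()):
    print(f"{workgroup}: {share:.4f}")
```

Because every fraction depends on the total, adding a first-time investor changes the share of every existing workgroup, which is why the fairshare factors of all workgroups must be recomputed after an expansion.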