technical:whitepaper:start

Differences

This shows you the differences between two versions of the page.

technical:whitepaper:start [2018-12-10 11:00] frey
technical:whitepaper:start [2021-02-12 16:01] frey
Line 2: Line 2:
  
 {{ :technical:whitepaper:1344435822_27-edit_text.png?128|}}Some of the content in this area will be in PDF format and may need to be downloaded before being read.
 +
 +===== Mellanox UCX and Open MPI on DARWIN =====
 +
 +During early-access testing of the DARWIN cluster, several users reported MPI jobs crashing unexpectedly in code locations that worked on previous clusters (like Caviness).  The [[technical:whitepaper:darwin_ucx_openmpi|full troubleshooting and mitigation]] of the issue should be instructive for DARWIN users who build and manage their own Open MPI libraries on DARWIN.
 +
 +===== /dev/shm exhaustion =====
 +
 +Over time, the ''/dev/shm'' filesystem on compute nodes can fill with orphaned files.  Because ''/dev/shm'' is memory-backed, on a node without swap matching its amount of RAM these files take memory away from subsequent applications that run on the node.  [[technical:whitepaper:automated_devshm_cleanup|Automated /dev/shm cleanup]] outlines a method of removing orphaned files from ''/dev/shm''.
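
To illustrate the idea of spotting orphaned files, here is a minimal Python sketch. It is not the method from the linked whitepaper. It assumes that "orphaned" can be approximated as "a regular file under ''/dev/shm'' that no running process has open or mapped and that has not been modified recently". The function names (''paths_in_use'', ''find_orphan_candidates'') and the one-hour age threshold are hypothetical choices made for this example. The sketch only reports candidates and deletes nothing.

```python
import os
import stat
import time


def paths_in_use(proc_root="/proc"):
    """Collect paths that running processes have open or memory-mapped.

    Reads /proc/<pid>/fd symlinks and /proc/<pid>/maps. Processes we are not
    permitted to inspect are skipped, so run as root for a complete picture.
    """
    in_use = set()
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return in_use
    for pid in entries:
        if not pid.isdigit():
            continue
        # Open file descriptors.
        fd_dir = os.path.join(proc_root, pid, "fd")
        try:
            for fd in os.listdir(fd_dir):
                try:
                    in_use.add(os.readlink(os.path.join(fd_dir, fd)))
                except OSError:
                    pass  # descriptor closed while we were looking
        except OSError:
            pass  # process exited, or permission denied
        # Memory-mapped files (shared memory is often mapped, then closed).
        try:
            with open(os.path.join(proc_root, pid, "maps")) as maps:
                for line in maps:
                    fields = line.split(None, 5)
                    if len(fields) == 6:
                        in_use.add(fields[5].strip())
        except OSError:
            pass
    return in_use


def find_orphan_candidates(root="/dev/shm", min_age_seconds=3600,
                           in_use=None, now=None):
    """Return sorted regular files under root that are unused and old enough."""
    if in_use is None:
        in_use = paths_in_use()
    if now is None:
        now = time.time()
    candidates = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.lstat(path)
            except OSError:
                continue  # file vanished between listing and stat
            if not stat.S_ISREG(info.st_mode):
                continue  # skip symlinks, sockets, and other special files
            if path in in_use:
                continue  # still open or mapped by some process
            if now - info.st_mtime < min_age_seconds:
                continue  # modified too recently to call orphaned
            candidates.append(path)
    return sorted(candidates)
```

Calling ''find_orphan_candidates()'' with no arguments scans ''/dev/shm'' on the local node. A production cleanup on a shared cluster would more likely also consult the job scheduler, for example to spare files owned by users who still have jobs on the node; that step is left out here.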
  
 ===== R: runtime configurable BLAS/LAPACK =====
Line 58: Line 66:
  
   * [[hpcc open64 acml|open64 compiler with ACML and openmpi libraries]]
- 
  
  • technical/whitepaper/start.txt
  • Last modified: 2022-06-17 12:25
  • by frey