technical:whitepaper:start

Differences

This shows you the differences between two versions of the page.

technical:whitepaper:start [2018-12-10 11:00] frey
technical:whitepaper:start [2022-06-17 12:25] (current) – [rJava: when compilers get too smart] frey
Line 2: Line 2:
  
 {{ :technical:whitepaper:1344435822_27-edit_text.png?128|}}Some of the content in this area will be in PDF format and may need to be downloaded before being read.
 +
 +===== rJava:  When Compilers Get Too Smart =====
 +
 +While installing all of the nearly 17k CRAN packages on a recent R 4.1.3 build, we found that many packages with a dependency on rJava would hang when being tested.  [[technical:whitepaper:rJava-gcc-optimization|GDB debugging and analysis of both the C source and runtime assembly code]] revealed an interesting problem with GCC 11.2's compilation of the code.
 +
 +===== Open MPI, PSM2, and MPI_Comm_spawn() =====
 +
 +The MPI process-spawning API has not been frequently used on our clusters.  A user reported an issue with the Rmpi library and example code that spawns R workers via MPI_Comm_spawn() on the Caviness cluster.  The issue was debugged and addressed for all pertinent versions of Open MPI, and is [[technical:whitepaper:openmpi-psm2-spawn|summarized here]].
 +
 +===== Mellanox UCX and Open MPI on DARWIN =====
 +
 +During early-access testing of the DARWIN cluster, several users reported that their MPI jobs crashed unexpectedly in code locations that had worked on previous clusters (like Caviness).  The [[technical:whitepaper:darwin_ucx_openmpi|full troubleshooting and mitigation]] of the issue should be instructive for DARWIN users who attempt to build and manage their own Open MPI libraries on DARWIN.
 +
 +===== /dev/shm exhaustion =====
 +
 +Over time, the ''/dev/shm'' filesystem on compute nodes can fill with orphaned files.  On a node without swap space matching its amount of RAM, these files occupy memory and begin putting pressure on subsequent applications that run on the node.  [[technical:whitepaper:automated_devshm_cleanup|Automated /dev/shm cleanup]] outlines a method of removing orphaned files from ''/dev/shm''.
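
As a rough illustration of what "orphaned" can mean here, the sketch below lists files under ''/dev/shm'' that no running process has open or mapped and that have not been modified recently.  It is not the method from the linked whitepaper: the function names ''open_file_targets'' and ''find_orphans'', the one-hour age threshold, and the choice to only report (never delete) are all assumptions made for this example.  It relies on the Linux ''/proc/<pid>/fd'' and ''/proc/<pid>/maps'' interfaces and should be run with enough privilege to inspect other users' processes.

```python
import os
import time


def open_file_targets(proc_root="/proc"):
    """Collect paths that running processes hold open or have memory-mapped."""
    targets = set()
    for pid in os.listdir(proc_root):
        if not pid.isdigit():
            continue  # skip non-process entries such as /proc/meminfo
        fd_dir = os.path.join(proc_root, pid, "fd")
        try:
            for fd in os.listdir(fd_dir):
                try:
                    targets.add(os.readlink(os.path.join(fd_dir, fd)))
                except OSError:
                    continue  # descriptor closed while we were looking
        except OSError:
            pass  # process exited, or we lack permission to inspect it
        # A file can stay mapped after its descriptor is closed, so the
        # mappings are checked as well (path is the sixth field, if present).
        try:
            with open(os.path.join(proc_root, pid, "maps")) as fh:
                for line in fh:
                    parts = line.split(None, 5)
                    if len(parts) == 6:
                        targets.add(parts[5].rstrip("\n"))
        except OSError:
            pass
    return targets


def find_orphans(shm_root="/dev/shm", proc_root="/proc", min_age_s=3600, now=None):
    """Return sorted paths under shm_root that are unused and older than min_age_s."""
    now = time.time() if now is None else now
    in_use = open_file_targets(proc_root)
    orphans = []
    for dirpath, _dirs, files in os.walk(shm_root):
        for name in files:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                continue  # file vanished between listing and stat
            if path in in_use:
                continue  # still open or mapped by some process
            if now - st.st_mtime < min_age_s:
                continue  # too recent to call orphaned (assumed threshold)
            orphans.append(path)
    return sorted(orphans)


if __name__ == "__main__":
    for path in find_orphans():
        print(path)
```

The scan is inherently racy, since a process may open a file just after it is checked; reporting candidates and reviewing them before any removal keeps this sketch on the cautious side.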
  
 ===== R: runtime configurable BLAS/LAPACK =====
Line 58: Line 74:
  
   * [[hpcc open64 acml|open64 compiler with ACML and openmpi libraries]]
- 
  
  • technical/whitepaper/start.1544457615.txt.gz
  • Last modified: 2018-12-10 11:00
  • by frey