technical:generic:openmpi-4-ucx-issue

Differences

This shows you the differences between two versions of the page.

technical:generic:openmpi-4-ucx-issue [2024-12-05 10:37] frey
technical:generic:openmpi-4-ucx-issue [2024-12-05 10:47] (current) – [How is it fixed] frey
Line 75: Line 75:
 </code> </code>
  
-The GCC compiler implements a ''%%__builtin_ctz()%%'' function that may directly produce machine-level assembly code (very fast) when the target ISA supports it.  The surrounding code in the UCX PML ensures the integer value is a power of 2 and non-zero, so the fix simplifies to
 +Adding the precondition that ''v ≠ 0'' removes the leading conditional:
 + 
 +<code c> 
 +int ctz(unsigned int v)
 +{
 +    int     l = 0;
 +
 +    while ( (v & 1) == 0 ) l++, v >>= 1;
 +    return l;
 +}
 +</code> 
 + 
 +and if ''v'' is guaranteed to be a power of two -- implying a single bit is set -- the code becomes 
 + 
 +<code c> 
 +int ctz(unsigned int v)
 +{
 +    int     l = -1;
 +
 +    do { l++, v >>= 1; } while (v);
 +    return l;
 +}
 +</code> 
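
As a rough check of the two preconditioned forms above, the following sketch places them side by side. The names ''ctz_nonzero'', ''ctz_pow2'', and ''check_forms'' are hypothetical and chosen only for this illustration; they do not appear in the UCX PML source. It assumes the two loops are copied unchanged from the listings above.

```c
#include <assert.h>
#include <limits.h>

/* Form 1: valid only when v != 0 (the loop never ends for v == 0). */
int ctz_nonzero(unsigned int v)
{
    int l = 0;

    while ((v & 1) == 0) l++, v >>= 1;
    return l;
}

/* Form 2: valid only when v is a power of two (exactly one bit set). */
int ctz_pow2(unsigned int v)
{
    int l = -1;

    do { l++, v >>= 1; } while (v);
    return l;
}

/* Hypothetical helper: asserts both forms agree on every power of two. */
void check_forms(void)
{
    int bits = (int)(sizeof(unsigned int) * CHAR_BIT);

    for (int bit = 0; bit < bits; bit++) {
        unsigned int v = 1u << bit;
        assert(ctz_nonzero(v) == bit);
        assert(ctz_pow2(v) == bit);
    }
}
```

The two forms agree only under the stronger precondition. For ''v = 12'' (binary 1100) the first form returns 2, the count of trailing zeros, while the second returns 3, the index of the highest set bit, which is why the power-of-two guarantee matters before using the second form.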
 + 
 +GCC provides a ''%%__builtin_ctz()%%'' built-in that can compile directly to a machine instruction (very fast) when the target ISA supports one; its result is undefined for an argument of 0.  The surrounding code in the UCX PML ensures the integer value is a non-zero power of 2, so the final form above is permissible:
  
 <code c> <code c>
  • Last modified: 2024-12-05 10:37
  • by frey