<html>
    <head>
      <base href="https://llvm.org/bugs/" />
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW --- - AVX512: Register allocator doesn't understand mask registers"
   href="https://llvm.org/bugs/show_bug.cgi?id=28839">28839</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>AVX512: Register allocator doesn't understand mask registers
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>libraries
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>trunk
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>PC
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>All
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>normal
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>P
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>Backend: X86
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>unassignedbugs@nondot.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>wenzel.jakob@epfl.ch
          </td>
        </tr>

        <tr>
          <th>CC</th>
          <td>llvm-bugs@lists.llvm.org
          </td>
        </tr>

        <tr>
          <th>Classification</th>
          <td>Unclassified
          </td>
        </tr></table>
      <p>
        <div>
        <pre>The LLVM register allocator handles AVX-512 mask registers poorly: it
sometimes generates needless sequences of moves from mask registers to integer
registers and back again.

Consider the following code fragment compiled with the HEAD revision of
LLVM/Clang:

#include &lt;immintrin.h&gt;

__mmask16 combine(__m512 a, __m512 b, __m512 c, __m512 d, __m512 x) {
    __mmask16 m1 = _mm512_cmp_ps_mask(a, x, _CMP_GE_OS);
    __mmask16 m2 = _mm512_cmp_ps_mask(b, x, _CMP_GE_OS);
    __mmask16 m3 = _mm512_cmp_ps_mask(c, x, _CMP_GE_OS);
    __mmask16 m4 = _mm512_cmp_ps_mask(d, x, _CMP_GE_OS);

    return _mm512_kor(_mm512_kor(m1, m2), _mm512_kor(m3, m4));
}

This is what I get (clang++ -mavx512f test.cpp -o test.s -O3 -S
-fomit-frame-pointer):

__Z7combineDv16_fS_S_S_S_:              ## @_Z7combineDv16_fS_S_S_S_
    vcmpgeps    %zmm4, %zmm0, %k0
    kmovw    %k0, %eax
    vcmpgeps    %zmm4, %zmm1, %k0
    kmovw    %k0, %ecx
    vcmpgeps    %zmm4, %zmm2, %k0
    kmovw    %k0, %edx
    vcmpgeps    %zmm4, %zmm3, %k0
    kmovw    %k0, %esi
    kmovw    %ecx, %k0
    kmovw    %eax, %k1
    korw    %k0, %k1, %k0
    kmovw    %esi, %k1
    kmovw    %edx, %k2
    korw    %k1, %k2, %k1
    korw    %k1, %k0, %k0
    kmovw    %k0, %eax
    movzwl    %ax, %eax
    retq


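Note the unnecessary 'kmovw' instructions: each comparison result is copied
from 'k0' into a general-purpose register and then back into a mask register
before the 'korw'. Also, every vcmpgeps writes its output to 'k0'; the allocator
never seems to pick another mask register as the destination.</pre>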
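        <pre>As a cross-check for any codegen fix, here is a minimal scalar sketch of the
value combine() should return, assuming mask bit i corresponds to vector lane i.
The helper names cmp_ge_mask and combine_ref are made up for illustration, and
the model says nothing about register allocation itself.

```cpp
// Scalar sketch of the result combine() is expected to return.
// cmp_ge_mask and combine_ref are hypothetical names, not part of any API.

// Build a 16-bit mask with bit i set when a[i] >= x[i]. A NaN in either
// operand leaves the bit clear, as with the ordered _CMP_GE_OS predicate.
unsigned short cmp_ge_mask(const float *a, const float *x) {
    unsigned mask = 0;
    unsigned bit = 1;  // value of the mask bit for the current lane
    for (int i = 0; i != 16; ++i) {
        if (a[i] >= x[i])
            mask |= bit;
        bit *= 2;
    }
    return (unsigned short) mask;
}

// OR the four comparison masks together in the same shape as the reproducer.
unsigned short combine_ref(const float *a, const float *b, const float *c,
                           const float *d, const float *x) {
    unsigned short m1 = cmp_ge_mask(a, x);
    unsigned short m2 = cmp_ge_mask(b, x);
    unsigned short m3 = cmp_ge_mask(c, x);
    unsigned short m4 = cmp_ge_mask(d, x);
    return (unsigned short) ((m1 | m2) | (m3 | m4));
}
```

Each argument is a pointer to 16 floats, standing in for one __m512 operand.</pre>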
        </div>
      </p>
    </body>
</html>