<html>
    <head>
      <base href="https://bugs.llvm.org/">
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW - wrong codegen due to _mm_mpsadbw_epu8 intrinsic incorrectly marked as commutative"
   href="https://bugs.llvm.org/show_bug.cgi?id=51908">51908</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>wrong codegen due to _mm_mpsadbw_epu8 intrinsic incorrectly marked as commutative
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>libraries
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>trunk
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>PC
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>All
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>normal
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>P
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>Backend: X86
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>unassignedbugs@nondot.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>benjsith@gmail.com
          </td>
        </tr>

        <tr>
          <th>CC</th>
          <td>craig.topper@gmail.com, llvm-bugs@lists.llvm.org, llvm-dev@redking.me.uk, pengfei.wang@intel.com, spatel+llvm@rotateright.com
          </td>
        </tr></table>
      <p>
        <div>
<pre>I came across a case where using the Intel SSE4.1 intrinsic _mm_mpsadbw_epu8
leads to a miscompilation when optimization (-O1) is turned on.

I tried to come up with a minimal repro, as follows:

__m128i do_stuff(const __m128i* iVals) {
        const __m128i I0 = _mm_load_si128(&iVals[0]);
        const __m128i I1 = _mm_load_si128(&iVals[1]);
        const __m128i I2 = _mm_load_si128(&iVals[2]);

        const __m128i A = _mm_mpsadbw_epu8(I0, I2, 0);
        const __m128i B = _mm_add_epi8(I2, I1);
        const __m128i C = _mm_add_epi8(B, A);
        return C;
}

This function produces correct results when compiled with -O0, but with -O1 it
gives incorrect results. The -O1 assembly output is as follows:

do_stuff(long long __vector(2) const*):
        vmovdqa xmm0, xmmword ptr [rdi + 32]
        vmpsadbw        xmm1, xmm0, xmmword ptr [rdi], 0
        vpaddb  xmm0, xmm0, xmmword ptr [rdi + 16]
        vpaddb  xmm0, xmm0, xmm1
        ret

This is mostly correct, but the vmpsadbw instruction has had its source operands
swapped. The output is equivalent to having called
_mm_mpsadbw_epu8(I2, I0, 0)
instead. I believe this is because the corresponding intrinsic is marked as
commutative in LLVM. In llvm/include/llvm/IR/IntrinsicsX86.td, lines 791-796:

// Vector sum of absolute differences
let TargetPrefix = "x86" in {  // All intrinsics start with "llvm.x86.".
  def int_x86_sse41_mpsadbw         : GCCBuiltin<"__builtin_ia32_mpsadbw128">,
          Intrinsic<[llvm_v8i16_ty], [llvm_v16i8_ty, llvm_v16i8_ty,llvm_i8_ty],
                    [IntrNoMem, Commutative, ImmArg<ArgIndex<2>>]>;
}

However, this operation is not commutative. The byte-wise differences are
computed using different indices into the first and second arguments, so
swapping the arguments changes the result.
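
A scalar sketch of the imm8 = 0 case may help illustrate the asymmetry. It
assumes the semantics given in Intel's documentation for MPSADBW: each of the
eight 16-bit outputs is a sum of four absolute differences, where the first
argument is read through a sliding window and the second argument only
contributes its first four bytes. The function name mpsadbw_imm0 is
hypothetical and is not part of LLVM or the intrinsics headers.

```c
#include "stdint.h"
#include "stdlib.h"

/* Hypothetical scalar model of _mm_mpsadbw_epu8(a, b, 0), written from the
 * documented semantics (an assumption, not taken from LLVM's sources).
 * out[j] = |a[j+0]-b[0]| + |a[j+1]-b[1]| + |a[j+2]-b[2]| + |a[j+3]-b[3]| */
static void mpsadbw_imm0(const uint8_t a[16], const uint8_t b[16],
                         uint16_t out[8]) {
    for (int j = 0; j != 8; ++j) {
        unsigned sum = 0;
        for (int k = 0; k != 4; ++k) {
            /* a is indexed with the sliding offset j + k, b only with k */
            sum += (unsigned)abs((int)a[j + k] - (int)b[k]);
        }
        out[j] = (uint16_t)sum;
    }
}
```

With a[i] = i and b all zero, this model gives 6, 10, 14, ..., 34 for (a, b),
but 6 in every lane for (b, a), so the two argument orders are not
interchangeable.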

I first noticed this on Clang 12.0 for Windows, however I tested it on Godbolt
using the trunk Clang compiler and it still repros there. Here is a link to the
Godbolt code: <a href="https://godbolt.org/z/zs576oh39">https://godbolt.org/z/zs576oh39</a>

Cheers, and let me know if you need anything else (or if I've gotten something
wrong: this is my first bug filed)</pre>
        </div>
      </p>


    </body>
</html>