<html>
<head>
<base href="https://bugs.llvm.org/">
</head>
<body><table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Bug ID</th>
<td><a class="bz_bug_link
bz_status_NEW "
title="NEW - wrong codegen due to _mm_mpsadbw_epu8 intrinsic incorrectly marked as commutative"
href="https://bugs.llvm.org/show_bug.cgi?id=51908">51908</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>wrong codegen due to _mm_mpsadbw_epu8 intrinsic incorrectly marked as commutative
</td>
</tr>
<tr>
<th>Product</th>
<td>libraries
</td>
</tr>
<tr>
<th>Version</th>
<td>trunk
</td>
</tr>
<tr>
<th>Hardware</th>
<td>PC
</td>
</tr>
<tr>
<th>OS</th>
<td>All
</td>
</tr>
<tr>
<th>Status</th>
<td>NEW
</td>
</tr>
<tr>
<th>Severity</th>
<td>normal
</td>
</tr>
<tr>
<th>Priority</th>
<td>P
</td>
</tr>
<tr>
<th>Component</th>
<td>Backend: X86
</td>
</tr>
<tr>
<th>Assignee</th>
<td>unassignedbugs@nondot.org
</td>
</tr>
<tr>
<th>Reporter</th>
<td>benjsith@gmail.com
</td>
</tr>
<tr>
<th>CC</th>
<td>craig.topper@gmail.com, llvm-bugs@lists.llvm.org, llvm-dev@redking.me.uk, pengfei.wang@intel.com, spatel+llvm@rotateright.com
</td>
</tr></table>
<p>
<div>
<pre>I came across a case where using the Intel SSE4.1 intrinsic _mm_mpsadbw_epu8
appears to lead to a mis-compilation when optimization (O1) is turned on.
I tried to come up with a minimal repro, as follows:
__m128i do_stuff(const __m128i* iVals) {
    const __m128i I0 = _mm_load_si128(&iVals[0]);
    const __m128i I1 = _mm_load_si128(&iVals[1]);
    const __m128i I2 = _mm_load_si128(&iVals[2]);
    const __m128i A = _mm_mpsadbw_epu8(I0, I2, 0);
    const __m128i B = _mm_add_epi8(I2, I1);
    const __m128i C = _mm_add_epi8(B, A);
    return C;
}
This function will run fine when compiled with -O0, but when using -O1 it gives
incorrect results. The -O1 assembly output is as follows:
do_stuff(long long __vector(2) const*):
        vmovdqa  xmm0, xmmword ptr [rdi + 32]
        vmpsadbw xmm1, xmm0, xmmword ptr [rdi], 0
        vpaddb   xmm0, xmm0, xmmword ptr [rdi + 16]
        vpaddb   xmm0, xmm0, xmm1
        ret
This is mostly correct; however, the vmpsadbw instruction has had its operand
order flipped. It is equivalent to having called
_mm_mpsadbw_epu8(I2, I0, 0)
instead. I believe this is because in the LLVM code, this intrinsic is marked
as commutative. In llvm/include/llvm/IR/IntrinsicsX86.td, lines 791-796:
// Vector sum of absolute differences
let TargetPrefix = "x86" in {  // All intrinsics start with "llvm.x86.".
  def int_x86_sse41_mpsadbw : GCCBuiltin<"__builtin_ia32_mpsadbw128">,
      Intrinsic<[llvm_v8i16_ty], [llvm_v16i8_ty, llvm_v16i8_ty, llvm_i8_ty],
                [IntrNoMem, Commutative, ImmArg<ArgIndex<2>>]>;
}
However, this opcode is not commutative. The byte-wise differences are
calculated using different indices for the first and second arguments, so
swapping the operand order produces different results.
I first noticed this on Clang 12.0 for Windows, however I tested it on Godbolt
using the trunk Clang compiler and it still repros there. Here is a link to the
Godbolt code: <a href="https://godbolt.org/z/zs576oh39">https://godbolt.org/z/zs576oh39</a>
Cheers, and let me know if you need anything else (or if I've gotten something
wrong: this is my first bug filed)</pre>
</div>
</p>
<hr>
<span>You are receiving this mail because:</span>
<ul>
<li>You are on the CC list for the bug.</li>
</ul>
</body>
</html>