[llvm-bugs] [Bug 41316] New: _mm_avg_epu16 preceded by a left shift results in poor code

via llvm-bugs llvm-bugs at lists.llvm.org
Sat Mar 30 02:23:01 PDT 2019


https://bugs.llvm.org/show_bug.cgi?id=41316

            Bug ID: 41316
           Summary: _mm_avg_epu16 preceded by a left shift results in poor
                    code
           Product: libraries
           Version: 8.0
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: Backend: X86
          Assignee: unassignedbugs at nondot.org
          Reporter: fabiang at radgametools.com
                CC: craig.topper at gmail.com, llvm-bugs at lists.llvm.org,
                    llvm-dev at redking.me.uk, spatel+llvm at rotateright.com

Reproducer:

-

#include <emmintrin.h>  /* SSE2: __m128i, _mm_slli_epi16, _mm_avg_epu16 */

__m128i f(__m128i a, __m128i b)
{
  __m128i a_shifted = _mm_slli_epi16(a, 2);
  __m128i b_shifted = _mm_slli_epi16(b, 2);
  return _mm_avg_epu16(a_shifted, b_shifted);
}

-

With clang 5.0.0 or older, I get the expected output:

-

f(long long __vector(2), long long __vector(2)):
        psllw   xmm0, 2
        psllw   xmm1, 2
        pavgw   xmm0, xmm1
        ret

-

But it looks like something broke sometime between 5.0.0 and 6.0.0, and the
same code now results in:

-

.LCPI0_0:
        .short  1                       # 0x1
        .short  1                       # 0x1
        .short  1                       # 0x1
        .short  1                       # 0x1
        .short  1                       # 0x1
        .short  1                       # 0x1
        .short  1                       # 0x1
        .short  1                       # 0x1
f(long long __vector(2), long long __vector(2)):
        psllw   xmm0, 2
        psllw   xmm1, 2
        pxor    xmm2, xmm2
        movdqa  xmm3, xmm1
        punpckhwd       xmm3, xmm2
        punpcklwd       xmm1, xmm2
        por     xmm0, xmmword ptr [rip + .LCPI0_0]
        movdqa  xmm4, xmm0
        punpckhwd       xmm4, xmm2
        paddd   xmm4, xmm3
        punpcklwd       xmm0, xmm2
        paddd   xmm0, xmm1
        pslld   xmm4, 15
        psrad   xmm4, 16
        pslld   xmm0, 15
        psrad   xmm0, 16
        packssdw        xmm0, xmm4
        ret

-

Everything after the "psllw xmm1, 2" appears to be a replacement expansion for
pavgw. This particular example was compiled with 8.0.0, but 6.x and 7.x produce
similar code.

If I shift just one of the inputs, I do get a pavgw, so having both left shifts
appears to matter in some way. (Reduced from a more complex example.)
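
For reference, here is a scalar sketch of what the two listings compute per
16-bit lane. It rests on one reading of the longer listing, which is an
assumption and not a confirmed diagnosis: the por against the vector of 1s
stands in for the "+ 1" rounding term of the average, which is only valid
because the shifted inputs have zero low bits. The function names are
hypothetical, and the code models the arithmetic only, not the unpack/pack
shuffling.

```c
#include <stdint.h>

/* Scalar model of one 16-bit lane of pavgw: rounded unsigned average,
   computed in 32 bits so the sum cannot overflow. */
static uint16_t avg_u16(uint16_t a, uint16_t b)
{
    return (uint16_t)(((uint32_t)a + (uint32_t)b + 1u) >> 1);
}

/* Scalar model of one lane of the expanded sequence (assumed reading):
   OR the first operand with 1, widen, add, then halve. */
static uint16_t avg_u16_expanded(uint16_t a, uint16_t b)
{
    return (uint16_t)((((uint32_t)a | 1u) + (uint32_t)b) >> 1);
}

/* Returns 1 if both models agree when each input has been shifted left by 2.
   a covers every 16-bit multiple of 4; b is a strided sample of such values. */
static int models_agree_on_shifted_inputs(void)
{
    for (uint32_t i = 0; i < 0x4000; ++i) {
        uint16_t a = (uint16_t)(i << 2);
        for (uint32_t j = 0; j < 0x10000; j += 251) {
            uint16_t b = (uint16_t)(j << 2);
            if (avg_u16(a, b) != avg_u16_expanded(a, b))
                return 0;
        }
    }
    return 1;
}
```

Under this model the two forms agree whenever the first operand is even, and
differ otherwise (for example a = 1, b = 0 gives 1 versus 0). So the expansion
looks like a correct but unrecognized spelling of the same average.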
