[llvm-bugs] [Bug 42870] New: Regression: LLVM9 trunk misoptimization of llvm.x86.sse.movmsk.ps for i586-unknown-linux-gnu

via llvm-bugs llvm-bugs at lists.llvm.org
Fri Aug 2 00:43:06 PDT 2019


https://bugs.llvm.org/show_bug.cgi?id=42870

            Bug ID: 42870
           Summary: Regression: LLVM9 trunk misoptimization of
                    llvm.x86.sse.movmsk.ps for i586-unknown-linux-gnu
           Product: libraries
           Version: trunk
          Hardware: PC
                OS: All
            Status: NEW
          Severity: release blocker
          Priority: P
         Component: Backend: X86
          Assignee: unassignedbugs at nondot.org
          Reporter: gonzalobg88 at gmail.com
                CC: craig.topper at gmail.com, llvm-bugs at lists.llvm.org,
                    llvm-dev at redking.me.uk, spatel+llvm at rotateright.com

LLVM8 opt and LLVM9 (trunk) opt optimize llvm.x86.sse.movmsk.ps differently.
LLVM8 leaves the intrinsic call as is, while LLVM9 transforms it into the
following IR (see godbolt: https://godbolt.org/z/35BloZ):

define i32 @_ZN7example15_mm_movemask_ps17h9d7ca884d8f840c4E(<4 x float>* noalias nocapture readonly dereferenceable(16) %a) unnamed_addr #0 {
start:
  %0 = bitcast <4 x float>* %a to <4 x i32>*
  %1 = load <4 x i32>, <4 x i32>* %0, align 16
  %2 = icmp slt <4 x i32> %1, zeroinitializer
  %3 = bitcast <4 x i1> %2 to i4
  %4 = zext i4 %3 to i32
  ret i32 %4
}
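
The IR above can be read as a per-lane sign-bit test. What follows is only a
minimal scalar sketch in Rust of what that IR appears to compute; the function
name movemask_model is hypothetical and does not come from the report, and the
lane-to-bit mapping assumes a little-endian target such as x86:

```rust
/// Scalar model of the transformed IR: view each lane's bit pattern as a
/// signed 32-bit integer, compare it against zero, and pack the four results
/// into the low 4 bits of the return value.
/// (`movemask_model` is a hypothetical name used only for illustration.)
fn movemask_model(a: [f32; 4]) -> i32 {
    let mut mask = 0i32;
    for (i, lane) in a.iter().enumerate() {
        // `icmp slt <4 x i32> %1, zeroinitializer`: true iff the sign bit is set.
        let negative = (lane.to_bits() as i32) < 0;
        // `bitcast <4 x i1> %2 to i4` on little-endian x86: lane i becomes bit i.
        mask |= (negative as i32) << i;
    }
    // `zext i4 %3 to i32`: the upper 28 bits stay zero.
    mask
}
```

Under these assumptions, movemask_model([-1.0, 2.0, -3.0, 4.0]) yields 0b0101,
which matches the documented behaviour of movmskps (bit i is the sign bit of
lane i).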

As a consequence, LLVM8 produces the following machine code for _mm_movemask_ps
on i586-unknown-linux-gnu:

example::_mm_movemask_ps: # @example::_mm_movemask_ps
        mov     eax, dword ptr [esp + 4]
        movaps  xmm0, xmmword ptr [eax]
        movmskps        eax, xmm0
        ret

while LLVM9 produces a scalarized sequence with no movmskps:

example::_mm_movemask_ps: # @example::_mm_movemask_ps
        mov     eax, dword ptr [esp + 4]
        cmp     dword ptr [eax], 0
        sets    cl
        cmp     dword ptr [eax + 4], 0
        sets    dl
        add     dl, dl
        or      dl, cl
        cmp     dword ptr [eax + 8], 0
        sets    cl
        cmp     dword ptr [eax + 12], 0
        sets    al
        add     al, al
        or      al, cl
        shl     al, 2
        or      al, dl
        movzx   eax, al
        ret
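
Reading the LLVM9 output line by line, it appears to build the same 4-bit mask
as movmskps, only with four scalar compares and byte-register arithmetic. The
sketch below mirrors that sequence in Rust so the two can be compared; the
function names scalar_sequence and movmskps_reference are hypothetical, and the
lanes are assumed to be the four 32-bit words loaded from [eax]:

```rust
/// Mirrors the LLVM9 scalar sequence one instruction at a time.
/// `lanes` stands for the four 32-bit words at [eax], [eax+4], [eax+8], [eax+12].
/// (Hypothetical helper, written only to illustrate the listing.)
fn scalar_sequence(lanes: [i32; 4]) -> u32 {
    // cmp dword ptr [...], 0 ; sets r8  => 1 if the word is negative, else 0
    let sets = |x: i32| (x < 0) as u8;

    let mut cl = sets(lanes[0]);
    let mut dl = sets(lanes[1]);
    dl = dl.wrapping_add(dl); // add dl, dl
    dl |= cl;                 // or  dl, cl
    cl = sets(lanes[2]);
    let mut al = sets(lanes[3]);
    al = al.wrapping_add(al); // add al, al
    al |= cl;                 // or  al, cl
    al <<= 2;                 // shl al, 2
    al |= dl;                 // or  al, dl
    al as u32                 // movzx eax, al
}

/// Reference behaviour of movmskps: bit i of the result is the sign bit of lane i.
fn movmskps_reference(lanes: [i32; 4]) -> u32 {
    (0..4).fold(0, |m, i| m | ((((lanes[i] as u32) >> 31) & 1) << i))
}
```

If this reading is right, both functions agree for every combination of lane
signs, so the difference between LLVM8 and LLVM9 here is the length of the
generated code rather than the value returned.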
