[llvm-bugs] [Bug 37502] New: _mm_set_ps is lowered badly with sse4

via llvm-bugs llvm-bugs at lists.llvm.org
Thu May 17 07:47:07 PDT 2018


https://bugs.llvm.org/show_bug.cgi?id=37502

            Bug ID: 37502
           Summary: _mm_set_ps is lowered badly with sse4
           Product: libraries
           Version: trunk
          Hardware: PC
                OS: All
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: Backend: X86
          Assignee: unassignedbugs at nondot.org
          Reporter: jmuizelaar at mozilla.com
                CC: llvm-bugs at lists.llvm.org

#include <xmmintrin.h>  /* __m128, _mm_set_ps */

__m128 f(float aYScale, float aXScale) {
   return _mm_set_ps(aYScale, aXScale, aYScale, aXScale);
}

With -mssse3 this compiles to:

        unpcklps        %xmm0, %xmm1    # xmm1 = xmm1[0],xmm0[0],xmm1[1],xmm0[1]
        movddup %xmm1, %xmm0            # xmm0 = xmm1[0,0]

With -msse4 this compiles to:

        movaps  %xmm1, %xmm2
        insertps        $16, %xmm0, %xmm2 # xmm2 = xmm2[0],xmm0[0],xmm2[2,3]
        insertps        $32, %xmm1, %xmm2 # xmm2 = xmm2[0,1],xmm1[0],xmm2[3]
        insertps        $48, %xmm0, %xmm2 # xmm2 = xmm2[0,1,2],xmm0[0]
        movaps  %xmm2, %xmm0
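
For comparison, the two-instruction shuffle can be written by hand with intrinsics. This is only a sketch and not part of the original report: the helper names set_yxyx_shuffle and same_lanes are hypothetical, and _mm_movelh_ps (SSE1) is used in place of movddup because it duplicates the low 64 bits the same way without needing SSE3 headers. The compiler may still recombine these shuffles, so the emitted instructions are not guaranteed.

```c
#include <xmmintrin.h>  /* SSE1: _mm_set_ss, _mm_unpacklo_ps, _mm_movelh_ps */

/* Hypothetical helper (not from the bug report): builds the same vector as
 * _mm_set_ps(aYScale, aXScale, aYScale, aXScale) using an explicit
 * unpack + duplicate-low-half sequence. Lane 0 is listed first. */
static inline __m128 set_yxyx_shuffle(float aYScale, float aXScale) {
    __m128 x  = _mm_set_ss(aXScale);      /* [x, 0, 0, 0] */
    __m128 y  = _mm_set_ss(aYScale);      /* [y, 0, 0, 0] */
    __m128 lo = _mm_unpacklo_ps(x, y);    /* [x, y, 0, 0] */
    return _mm_movelh_ps(lo, lo);         /* [x, y, x, y] */
}

/* Hypothetical helper: returns 1 if all four lanes compare equal. */
static inline int same_lanes(__m128 a, __m128 b) {
    return _mm_movemask_ps(_mm_cmpeq_ps(a, b)) == 0xF;
}
```

Calling same_lanes(set_yxyx_shuffle(y, x), _mm_set_ps(y, x, y, x)) is one way to check that both forms produce the same lanes for ordinary (non-NaN) inputs.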

llvm-mca -mcpu=haswell agrees that the ssse3 version is better:

Iterations:     1
Instructions:   2
Total Cycles:   5
Dispatch Width: 4
IPC:            0.40

vs

Iterations:     1
Instructions:   5
Total Cycles:   8
Dispatch Width: 4
IPC:            0.62
