[llvm-bugs] [Bug 25146] New: Suboptimal across lane float min/max reduction

Mon Oct 12 06:03:39 PDT 2015

https://llvm.org/bugs/show_bug.cgi?id=25146

            Bug ID: 25146
           Summary: Suboptimal across lane float min/max reduction
           Product: libraries
           Version: trunk
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: normal
          Priority: P
         Component: Backend: AArch64
          Assignee: unassignedbugs at nondot.org
          Reporter: charlesturner7c5 at gmail.com
                CC: llvm-bugs at lists.llvm.org
    Classification: Unclassified

Created attachment 15056
  --> https://llvm.org/bugs/attachment.cgi?id=15056&action=edit
LLVM function showing missed opportunity.

For the attached test case, the AArch64 backend generates code with clear room
for improvement:

oversized_f_fmaxvec:                    // @oversized_f_fmaxvec
        .cfi_startproc
// BB#0:
        ldp     q0, q1, [x0, #32]
        ldp     q3, q2, [x0]
        fmaxnm  v4.4s, v0.4s, v0.4s
        fmaxnm  v5.4s, v1.4s, v0.4s
        fmaxnm  v1.4s, v2.4s, v1.4s
        fmaxnm  v0.4s, v3.4s, v0.4s
        fmaxnm  v0.4s, v0.4s, v1.4s
        fmaxnm  v1.4s, v1.4s, v0.4s
        fmaxnm  v2.4s, v5.4s, v0.4s
        fmaxnm  v3.4s, v4.4s, v0.4s
        ext     v4.16b, v0.16b, v0.16b, #8
        fmaxnm  v3.4s, v3.4s, v0.4s
        fmaxnm  v2.4s, v2.4s, v0.4s
        fmaxnm  v1.4s, v1.4s, v0.4s
        fmaxnm  v0.4s, v0.4s, v4.4s
        fcmge   v1.4s, v1.4s, v0.4s
        fcmge   v2.4s, v2.4s, v0.4s
        fcmge   v3.4s, v3.4s, v0.4s
        dup     v4.4s, v0.s[1]
        xtn     v1.4h, v1.4s
        xtn     v2.4h, v2.4s
        xtn     v3.4h, v3.4s
        fcmge   v4.4s, v0.4s, v4.4s
        uzp1    v2.8b, v3.8b, v2.8b
        xtn     v3.4h, v4.4s
        shl     v2.8b, v2.8b, #7
        uzp1    v1.8b, v3.8b, v1.8b
        sshr    v2.8b, v2.8b, #7
        shl     v1.8b, v1.8b, #7
        sshr    v1.8b, v1.8b, #7
        ins     v1.d[1], v2.d[0]
        shl     v1.16b, v1.16b, #7
        sshr    v1.16b, v1.16b, #7
        umov    w8, v1.b[0]
        mov     s1, v0.s[1]
        tst     w8, #0x1
        fcsel   s0, s0, s1, ne
        ret

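I would expect FMAXNMV (the across-lanes floating-point maximum-number
reduction) to be used here instead of the fcmge/xtn/shl/sshr/fcsel sequence.
I think r249834 was a step in this direction.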
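
To make the expected result concrete, here is a rough reference model of the
reduction, written in Python rather than assembly so it can be run anywhere.
It rests on assumptions: the attachment is not reproduced in this mail, so the
input shape (16 floats, inferred from the two ldp loads of four q registers)
is a guess, and the helper names fmaxnm, fmaxnm_vec, fmaxnmv and reduce_16 are
made up for illustration. The model only covers the quiet-NaN rule (if exactly
one operand is NaN, the other is returned) and ignores signed-zero ordering and
signalling NaNs.

```python
import math


def fmaxnm(a, b):
    """Scalar maximum-number: if exactly one operand is NaN, return the other."""
    if math.isnan(a):
        return b
    if math.isnan(b):
        return a
    return max(a, b)


def fmaxnm_vec(x, y):
    """Lane-wise maximum-number of two equal-length vectors (models fmaxnm vN.4s)."""
    return [fmaxnm(a, b) for a, b in zip(x, y)]


def fmaxnmv(v):
    """Across-lanes maximum-number of one vector (models fmaxnmv sN, vM.4s)."""
    acc = v[0]
    for lane in v[1:]:
        acc = fmaxnm(acc, lane)
    return acc


def reduce_16(values):
    """Reduce 16 floats: three lane-wise maxima, then one across-lanes maximum."""
    if len(values) != 16:
        raise ValueError("expected exactly 16 values")
    # Split into four 4-lane vectors, mirroring four q registers (assumed layout).
    q = [values[i:i + 4] for i in range(0, 16, 4)]
    lo = fmaxnm_vec(q[0], q[1])
    hi = fmaxnm_vec(q[2], q[3])
    return fmaxnmv(fmaxnm_vec(lo, hi))
```

Under these assumptions the whole reduction needs three lane-wise operations
and a single across-lanes step, which is the shape I would hope to see in the
generated code.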
