[llvm-bugs] [Bug 47677] New: VLD2 shuffle patterns gets broken apart by instcombine

Tue Sep 29 04:10:35 PDT 2020

https://bugs.llvm.org/show_bug.cgi?id=47677

            Bug ID: 47677
           Summary: VLD2 shuffle patterns gets broken apart by instcombine
           Product: libraries
           Version: trunk
          Hardware: PC
                OS: Windows NT
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: Scalar Optimizations
          Assignee: unassignedbugs at nondot.org
          Reporter: david.green at arm.com
                CC: llvm-bugs at lists.llvm.org

Given code that the vectorizer chooses to create a VLD2 interleaving group for:
https://godbolt.org/z/P6E88r
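
For readers who cannot open the link, the following is a hypothetical C loop
of the kind that leads to such an interleave group. It is a sketch inferred
from the IR in this report, not the code behind the godbolt link; the function
name sum_of_squares and its parameters are assumptions made for illustration.

```c
#include <stddef.h>

/* Hypothetical example: `in` holds interleaved pairs (a0, b0, a1, b1, ...)
   and each output element is a*a + b*b. The stride-2 accesses are what a
   vectorizer can turn into one wide load plus two de-interleaving shuffles. */
void sum_of_squares(const float *restrict in, float *restrict out, size_t n) {
    for (size_t i = 0; i < n; i++) {
        float a = in[2 * i];     /* even elements, cf. %strided.vec   */
        float b = in[2 * i + 1]; /* odd elements,  cf. %strided.vec19 */
        out[i] = a * a + b * b;
    }
}
```

Whether a given compiler version actually forms a VLD2 group for this loop
depends on the target and flags, so treat it only as an illustration of the
access pattern.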

We generate a loop that does:
  ; VLD2 group
  %wide.vec = load <8 x float>, <8 x float>* %2, align 4, !tbaa !11
  %strided.vec = shufflevector <8 x float> %wide.vec, <8 x float> undef, <4 x i32> <i32 0, i32 2, i32 4, i32 6>
  %strided.vec19 = shufflevector <8 x float> %wide.vec, <8 x float> undef, <4 x i32> <i32 1, i32 3, i32 5, i32 7>
  ; Operations
  %3 = fmul fast <4 x float> %strided.vec, %strided.vec
  %4 = fmul fast <4 x float> %strided.vec19, %strided.vec19
  %5 = fadd fast <4 x float> %4, %3
  ; store
  store <4 x float> %5, <4 x float>* %7, align 4, !tbaa !11

Currently, instcombine's foldVectorBinop sinks the shuffles past the fmul (as
both operands use the same shuffle mask). This means we end up with regular
non-interleaving loads and potentially expensive shuffles between the fmul
and the fadd.
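
As a rough scalar model of that transform (an illustration only, not actual
instcombine output), the two functions below compute the same squares. The
first de-interleaves and then multiplies, matching the IR above; the second
multiplies all 8 lanes and then de-interleaves, matching the form after the
shuffles are sunk. The function names are hypothetical.

```c
#include <stddef.h>

/* Shape emitted by the vectorizer: de-interleave the wide vector first,
   then multiply the 4 even lanes and the 4 odd lanes separately. */
void shuffle_then_mul(const float wide[8], float even_sq[4], float odd_sq[4]) {
    for (size_t i = 0; i < 4; i++) {
        float even = wide[2 * i];     /* mask <0, 2, 4, 6> */
        float odd = wide[2 * i + 1];  /* mask <1, 3, 5, 7> */
        even_sq[i] = even * even;
        odd_sq[i] = odd * odd;
    }
}

/* Shape after sinking: multiply all 8 lanes of the wide vector, then
   de-interleave the product. The shuffles no longer sit next to the load. */
void mul_then_shuffle(const float wide[8], float even_sq[4], float odd_sq[4]) {
    float prod[8];
    for (size_t i = 0; i < 8; i++) {
        prod[i] = wide[i] * wide[i];
    }
    for (size_t i = 0; i < 4; i++) {
        even_sq[i] = prod[2 * i];     /* mask <0, 2, 4, 6> */
        odd_sq[i] = prod[2 * i + 1];  /* mask <1, 3, 5, 7> */
    }
}
```

Both forms give identical results, so the rewrite is valid, but in the second
the shuffles are applied to the product rather than to the loaded value, so
the load and its shuffles no longer form the VLD2 pattern shown earlier.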

On NEON for AArch64 and Arm this will create zip instructions. For MVE, where
the zip/uzp instructions are not present, we fall back to even more expensive
register moves.

Similar things can happen in other cases:
https://godbolt.org/z/WKnPE9
