[llvm-bugs] [Bug 31879] New: vectorize repeated scalar ops that don't get put back into a vector

via llvm-bugs llvm-bugs at lists.llvm.org
Mon Feb 6 07:45:08 PST 2017


https://llvm.org/bugs/show_bug.cgi?id=31879

            Bug ID: 31879
           Summary: vectorize repeated scalar ops that don't get put back
                    into a vector
           Product: libraries
           Version: trunk
          Hardware: PC
                OS: All
            Status: NEW
          Severity: normal
          Priority: P
         Component: Transformation Utilities
          Assignee: unassignedbugs at nondot.org
          Reporter: spatel+llvm at rotateright.com
                CC: llvm-bugs at lists.llvm.org
    Classification: Unclassified

Forking this off from bug 31866:

; For vector <x0, x1>, return x0^2 + x1^2

define float @f(<2 x float> %x) {
  %x0 = extractelement <2 x float> %x, i32 0
  %x1 = extractelement <2 x float> %x, i32 1
  %x0x0 = fmul float %x0, %x0
  %x1x1 = fmul float %x1, %x1
  %add = fadd float %x0x0, %x1x1
  ret float %add
}

We should recognize patterns where the same scalar op is applied to every
element extracted from a vector, and turn them into a vector op followed by
extraction:

define float @f(<2 x float> %x) {
  %xx = fmul <2 x float> %x, %x
  %x0x0 = extractelement <2 x float> %xx, i32 0
  %x1x1 = extractelement <2 x float> %xx, i32 1
  %add = fadd float %x0x0, %x1x1
  ret float %add
}
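
As a sketch of how the same pattern would look at a wider vector width (the
<4 x float> type and the names @f4 and @f4_vec are hypothetical, chosen only
for illustration and not taken from bug 31866), the scalar form and the
desired form might be:

```
; Hypothetical 4-wide input: x0^2 + x1^2 + x2^2 + x3^2, all scalar math.
define float @f4(<4 x float> %x) {
  %x0 = extractelement <4 x float> %x, i32 0
  %x1 = extractelement <4 x float> %x, i32 1
  %x2 = extractelement <4 x float> %x, i32 2
  %x3 = extractelement <4 x float> %x, i32 3
  %x0x0 = fmul float %x0, %x0
  %x1x1 = fmul float %x1, %x1
  %x2x2 = fmul float %x2, %x2
  %x3x3 = fmul float %x3, %x3
  %add01 = fadd float %x0x0, %x1x1
  %add012 = fadd float %add01, %x2x2
  %add = fadd float %add012, %x3x3
  ret float %add
}

; Desired form: one vector fmul, then extract each product for the adds.
define float @f4_vec(<4 x float> %x) {
  %xx = fmul <4 x float> %x, %x
  %x0x0 = extractelement <4 x float> %xx, i32 0
  %x1x1 = extractelement <4 x float> %xx, i32 1
  %x2x2 = extractelement <4 x float> %xx, i32 2
  %x3x3 = extractelement <4 x float> %xx, i32 3
  %add01 = fadd float %x0x0, %x1x1
  %add012 = fadd float %add01, %x2x2
  %add = fadd float %add012, %x3x3
  ret float %add
}
```

The fadd chain keeps its original order in both versions, so only the
multiplies change from four scalar ops to one vector op.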

The SLP vectorizer already handles the case where the results are inserted back
into a vector, so perhaps we can loosen that restriction and reuse the existing
SLP logic to catch this case too?

define <2 x i8> @g(<2 x i8> %x) {
  %x0 = extractelement <2 x i8> %x, i32 0
  %x1 = extractelement <2 x i8> %x, i32 1
  %x0x0 = mul i8 %x0, %x0
  %x1x1 = mul i8 %x1, %x1
  %ins1 = insertelement <2 x i8> undef, i8 %x0x0, i32 0
  %ins2 = insertelement <2 x i8> %ins1, i8 %x1x1, i32 1
  ret <2 x i8> %ins2
}

$ ./opt -slp-vectorizer scalarizedmath.ll -S

define <2 x i8> @g(<2 x i8> %x) {
  %1 = mul <2 x i8> %x, %x
  %2 = extractelement <2 x i8> %1, i32 0
  %ins1 = insertelement <2 x i8> undef, i8 %2, i32 0
  %3 = extractelement <2 x i8> %1, i32 1
  %ins2 = insertelement <2 x i8> %ins1, i8 %3, i32 1
  ret <2 x i8> %ins2
}

Instcombine can then clean up the useless inserts and extracts.
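
For reference, a sketch of what @g would be expected to reduce to after that
cleanup, assuming Instcombine folds the extract/insert chain back into the
vector it was extracted from (this is not captured tool output):

```
define <2 x i8> @g(<2 x i8> %x) {
  ; Both lanes are extracted and re-inserted at the same index,
  ; so the chain is just the multiplied vector itself.
  %1 = mul <2 x i8> %x, %x
  ret <2 x i8> %1
}
```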
