[LLVMbugs] [Bug 22555] New: SLP vectorizer fails to vectorize obvious candidates

Wed Feb 11 18:01:22 PST 2015

http://llvm.org/bugs/show_bug.cgi?id=22555

            Bug ID: 22555
           Summary: SLP vectorizer fails to vectorize obvious candidates
           Product: libraries
           Version: trunk
          Hardware: PC
                OS: All
            Status: NEW
          Severity: normal
          Priority: P
         Component: Scalar Optimizations
          Assignee: unassignedbugs at nondot.org
          Reporter: chandlerc at gmail.com
                CC: llvmbugs at cs.uiuc.edu
    Classification: Unclassified

Consider my test case:

% cat wtf_slp.ll                                   
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

define void @test(i64* %in, i64* %out) {
  %in.0 = getelementptr i64* %in, i32 0
  %in.1 = getelementptr i64* %in, i32 1
  %a = load i64* %in.0, align 16
  %b = load i64* %in.1, align 8
  %a.and = and i64 %a, -2
  %b.and = and i64 %b, -2
  %out.0 = getelementptr i64* %out, i32 0
  %out.1 = getelementptr i64* %out, i32 1
  store i64 %a.and, i64* %out.0, align 16
  store i64 %b.and, i64* %out.1, align 8
  ret void
}

The loads and stores are perfectly aligned. The operations are trivial to do in
the vector unit. I would love this to get vectorized. It doesn't:

% ./bin/opt -S -mcpu=corei7 -o - -slp-vectorizer wtf_slp.ll -debug
Args: ./bin/opt -S -mcpu=corei7 -o - -slp-vectorizer wtf_slp.ll -debug 

Features:+64bit,+sse2
CPU:corei7

Subtarget features: SSELevel 7, 3DNowLevel 0, 64bit 1

Features:+64bit,+sse2
CPU:corei7

Subtarget features: SSELevel 7, 3DNowLevel 0, 64bit 1
SLP: Analyzing blocks in test.
SLP: Found 2 stores to vectorize.
SLP: Analyzing a store chain of length 2.
SLP: Analyzing a store chain of length 2
SLP: Analyzing 2 stores at offset 0
SLP:  bundle:   store i64 %a.and, i64* %out.0, align 16
SLP:  initialize schedule region to   store i64 %a.and, i64* %out.0, align 16
SLP:  extend schedule region end to   store i64 %b.and, i64* %out.1, align 8
SLP: try schedule bundle [  store i64 %a.and, i64* %out.0, align 16;  store i64 %b.and, i64* %out.1, align 8] in block
SLP:       update deps of [  store i64 %a.and, i64* %out.0, align 16;  store i64 %b.and, i64* %out.1, align 8]
SLP:       update deps of /   store i64 %b.and, i64* %out.1, align 8
SLP: We are not able to schedule this bundle!
SLP:  cancel scheduling of [  store i64 %a.and, i64* %out.0, align 16;  store i64 %b.and, i64* %out.1, align 8]
SLP: Calculating cost for tree of size 1.
SLP: Check whether the tree with height 1 is fully vectorizable .
SLP: Found cost=2147483647 for VF=2
; ModuleID = 'wtf_slp.ll'
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

define void @test(i64* %in, i64* %out) {
  %in.0 = getelementptr i64* %in, i32 0
  %in.1 = getelementptr i64* %in, i32 1
  %a = load i64* %in.0, align 16
  %b = load i64* %in.1, align 8
  %a.and = and i64 %a, -2
  %b.and = and i64 %b, -2
  %out.0 = getelementptr i64* %out, i32 0
  %out.1 = getelementptr i64* %out, i32 1
  store i64 %a.and, i64* %out.0, align 16
  store i64 %b.and, i64* %out.1, align 8
  ret void
}
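
For comparison, here is a hand-written sketch of roughly what a vectorized
version could look like. This is not output from opt; the value names %in.vec,
%v, %v.and and %out.vec are made up for illustration, and the syntax follows
the same older IR style as the test case. It assumes the align 16 on the first
load and store carries over to the whole 16-byte vector access:

define void @test(i64* %in, i64* %out) {
  ; Reinterpret the two adjacent i64 slots as a single <2 x i64>.
  %in.vec = bitcast i64* %in to <2 x i64>*
  %v = load <2 x i64>* %in.vec, align 16
  ; One vector 'and' against a splat of -2 replaces both scalar 'and's.
  %v.and = and <2 x i64> %v, <i64 -2, i64 -2>
  %out.vec = bitcast i64* %out to <2 x i64>*
  store <2 x i64> %v.and, <2 x i64>* %out.vec, align 16
  ret void
}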

Failing to vectorize obvious code is not only likely hurting performance and
generated code size; it also makes it impossible to write good test cases for
the SLP vectorizer when fixing bugs.

We really, really need this optimization pass to work in expected and obvious
ways. Right now, it appears to only handle very specific (and surprising)
patterns reliably.
