[llvm-bugs] [Bug 40310] New: [SLP] slp-vectorizer-hor miscompiles unrolled &
via llvm-bugs
llvm-bugs at lists.llvm.org
Mon Jan 14 14:21:46 PST 2019
https://bugs.llvm.org/show_bug.cgi?id=40310
Bug ID: 40310
Summary: [SLP] slp-vectorizer-hor miscompiles unrolled &
Product: libraries
Version: trunk
Hardware: PC
OS: Linux
Status: NEW
Severity: enhancement
Priority: P
Component: Loop Optimizer
Assignee: unassignedbugs at nondot.org
Reporter: fedor.v.sergeev at gmail.com
CC: llvm-bugs at lists.llvm.org
Created attachment 21322
--> https://bugs.llvm.org/attachment.cgi?id=21322&action=edit
IR that demonstrates slp-vectorizer miscompile
The attached IR (also quoted below) is a manually reduced result of optimizing
the following Java code:
---
void mainTest(int param, int values[], int len) {
    int mask = 31;
    for (int i = 0; i < len; i++) {
        values[i] = param--;
        for (int ix = 0; ix < 16; ix++)
            param &= mask++;
    }
}
---
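For anyone who wants to run the loop outside the original test, a
self-contained harness might look like the sketch below. The class name
MainTestRepro, the run helper, and the input values are assumptions made for
illustration; they are not part of the original test. Under plain Java
semantics each outer iteration stores param, decrements it, and then ANDs it
with 16 consecutive mask values.

```java
import java.util.Arrays;

public class MainTestRepro {
    // The loop from the report, unchanged apart from being made static.
    static void mainTest(int param, int values[], int len) {
        int mask = 31;
        for (int i = 0; i < len; i++) {
            values[i] = param--;
            for (int ix = 0; ix < 16; ix++)
                param &= mask++;
        }
    }

    // Hypothetical helper: runs mainTest on a fresh array and returns it.
    static int[] run(int param, int len) {
        int[] values = new int[len];
        mainTest(param, values, len);
        return values;
    }

    public static void main(String[] args) {
        // Inputs chosen arbitrarily for illustration.
        // prints [100, 0, 32, 0, 64]
        System.out.println(Arrays.toString(run(100, 5)));
    }
}
```

Comparing this output between an interpreted run and a compiled run is one
way to spot a divergence in the stored values.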
] cat before-slp.ll
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128-ni:1"
target triple = "x86_64-unknown-linux-gnu"
define void @mainTest(i32 %param, i8* align 8 %vals, i32 %len) {
bci_15.preheader:
br label %bci_15
bci_15:
%local_0_ = phi i32 [ %v43, %bci_15 ], [ %param, %bci_15.preheader ]
%local_4_ = phi i32 [ %v44, %bci_15 ], [ 31, %bci_15.preheader ]
%v12 = add i32 %local_0_, -1
%v13 = add i32 %local_4_, 1
%v14 = and i32 %local_4_, %v12
%v15 = add i32 %local_4_, 2
%v16 = and i32 %v13, %v14
%v17 = add i32 %local_4_, 3
%v18 = and i32 %v15, %v16
%v19 = add i32 %local_4_, 4
%v20 = and i32 %v17, %v18
%v21 = add i32 %local_4_, 5
%v22 = and i32 %v19, %v20
%v23 = add i32 %local_4_, 6
%v24 = and i32 %v21, %v22
%v25 = add i32 %local_4_, 7
%v26 = and i32 %v23, %v24
%v27 = add i32 %local_4_, 8
%v28 = and i32 %v25, %v26
%v29 = add i32 %local_4_, 9
%v30 = and i32 %v27, %v28
%v31 = add i32 %local_4_, 10
%v32 = and i32 %v29, %v30
%v33 = add i32 %local_4_, 11
%v34 = and i32 %v31, %v32
%v35 = add i32 %local_4_, 12
%v36 = and i32 %v33, %v34
%v37 = add i32 %local_4_, 13
%v38 = and i32 %v35, %v36
%v39 = add i32 %local_4_, 14
%v40 = and i32 %v37, %v38
%v41 = add i32 %local_4_, 15
%v42 = and i32 %v39, %v40
%v43 = and i32 %v41, %v42
%v44 = add i32 %local_4_, 16
br i1 true, label %bci_15, label %loopexit
loopexit:
ret void
}
]
These adds and ands are the result of unrolling the inner loop, which does
param &= mask++. Unfortunately, when horizontal reduction is applied, the
vectorizer starts emitting undefs:
] bin/opt -slp-vectorizer before-slp.ll -S | grep -c undef
23
]
These undefs essentially cause the computed value of 'param' to become undef
(it ends up as 0 in my case). This is a miscompile, and a rather subtle one
that is hard to track down :(
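As a sanity check on what a correct transformation has to preserve: the
unrolled body computes (param - 1) & mask & (mask + 1) & ... & (mask + 15),
and since & is associative and commutative, any reordering of that chain,
including a pairwise (horizontal) reduction, must yield the same fully defined
value. The sketch below models only the arithmetic. The class and method names
are hypothetical, and it makes no claim about how the SLP vectorizer actually
structures the reduction.

```java
public class AndReductionModel {
    // Scalar chain, mirroring %v12 .. %v43 in the IR above.
    static int scalarChain(int param, int mask) {
        int acc = param - 1;              // %v12
        for (int k = 0; k < 16; k++)
            acc &= mask + k;              // %v14, %v16, ..., %v43
        return acc;
    }

    // Same operands, combined pairwise (16 -> 8 -> 4 -> 2 -> 1 lanes),
    // with (param - 1) folded in at the end. This ordering is an
    // illustrative assumption, not the vectorizer's actual output.
    static int pairwiseReduction(int param, int mask) {
        int[] lanes = new int[16];
        for (int k = 0; k < 16; k++)
            lanes[k] = mask + k;
        for (int width = 8; width >= 1; width /= 2)
            for (int k = 0; k < width; k++)
                lanes[k] &= lanes[k + width];
        return lanes[0] & (param - 1);
    }

    public static void main(String[] args) {
        int[] params = {0, 1, 32, 100, -1, Integer.MIN_VALUE};
        for (int param : params)
            for (int mask = 31; mask < 31 + 16 * 8; mask += 16)
                if (scalarChain(param, mask) != pairwiseReduction(param, mask))
                    throw new AssertionError("mismatch at param=" + param
                            + ", mask=" + mask);
        System.out.println("all match");
    }
}
```

Both orderings are defined for every input, so a reduction that introduces
undef into the result cannot be a valid rewrite of the scalar chain.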
If horizontal reduction is disabled, there are no undefs and the code is
semantically correct. My original Java test also starts passing, so I'm pretty
sure that the miscompile is caused by horizontal reduction.
] bin/opt -slp-vectorizer before-slp.ll -slp-vectorize-hor=false -S | grep -c undef
0
]