[llvm-bugs] [Bug 35282] New: Functional bug: LoopVectorizer must update the nuw/nsw flags
via llvm-bugs
llvm-bugs at lists.llvm.org
Sun Nov 12 22:37:25 PST 2017
https://bugs.llvm.org/show_bug.cgi?id=35282
Bug ID: 35282
Summary: Functional bug: LoopVectorizer must update the nuw/nsw flags
Product: libraries
Version: trunk
Hardware: PC
OS: Linux
Status: NEW
Severity: normal
Priority: P
Component: Loop Optimizer
Assignee: unassignedbugs at nondot.org
Reporter: serguei.katkov at azul.com
CC: llvm-bugs at lists.llvm.org
Let's consider the following simple loop:
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128-ni:1"
target triple = "x86_64-unknown-linux-gnu"
define i64 @test(i32 %a) {
entry:
%factor = mul nsw i32 %a, -5
br label %loop
loop:
%b = phi i32 [ -35, %entry ], [ %e, %loop ]
%c = phi i32 [ 10, %entry ], [ %f, %loop ]
%d = sub nuw nsw i32 %b, %a
%e = add nsw i32 %factor, %d
%f = add nuw nsw i32 %c, 1
%cmp = icmp ugt i32 %c, 226
br i1 %cmp, label %done, label %loop
done:
%.lcssa = phi i32 [ %e, %loop ]
%g = sext i32 %.lcssa to i64
ret i64 %g
}
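To make it easier to see what the scalar loop computes, here is a rough Python model of @test. It is only a sketch: the helper names (to_u32, scalar_test) are made up for illustration, poison is represented by None, and the nuw/nsw checks are modeled by hand rather than taken from any LLVM tool.

```python
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1
UINT32_MAX = 2**32 - 1

def to_u32(x):
    # Reinterpret a signed i32 value as unsigned (hypothetical helper).
    return x & UINT32_MAX

def scalar_test(a):
    """Hand-written model of @test; returns None where the IR value would be poison."""
    factor = a * -5
    if not INT32_MIN <= factor <= INT32_MAX:      # mul nsw
        return None
    b, c = -35, 10                                # initial values of the two phis
    while True:
        d = b - a
        # sub nuw nsw: no signed overflow and no unsigned borrow
        if not INT32_MIN <= d <= INT32_MAX or to_u32(b) < to_u32(a):
            return None
        e = factor + d
        if not INT32_MIN <= e <= INT32_MAX:       # add nsw
            return None
        f = c + 1                                 # add nuw nsw, c stays small here
        if c > 226:                               # icmp ugt, c is non-negative
            return e
        b, c = e, f
```

Under this model the loop runs 218 times and each iteration subtracts 6 * a from the running value that starts at -35, so scalar_test(1) gives -1343 with no flag violated along the way.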
The Loop Vectorizer (opt --loop-vectorize test.ll -S) transforms it into:
...
vector.body: ; preds = %vector.body, %vector.ph
%index = phi i32 [ 0, %vector.ph ], [ %index.next, %vector.body ]
%vec.phi = phi <4 x i32> [ <i32 -35, i32 0, i32 0, i32 0>, %vector.ph ], [ %4, %vector.body ]
%vec.phi1 = phi <4 x i32> [ zeroinitializer, %vector.ph ], [ %5, %vector.body ]
%offset.idx = add i32 10, %index
%broadcast.splatinsert = insertelement <4 x i32> undef, i32 %offset.idx, i32 0
%broadcast.splat = shufflevector <4 x i32> %broadcast.splatinsert, <4 x i32> undef, <4 x i32> zeroinitializer
%induction = add <4 x i32> %broadcast.splat, <i32 0, i32 1, i32 2, i32 3>
%induction2 = add <4 x i32> %broadcast.splat, <i32 4, i32 5, i32 6, i32 7>
%0 = add i32 %offset.idx, 0
%1 = add i32 %offset.idx, 4
%2 = sub nuw nsw <4 x i32> %vec.phi, %broadcast.splat4
%3 = sub nuw nsw <4 x i32> %vec.phi1, %broadcast.splat6
%4 = add nsw <4 x i32> %broadcast.splat8, %2
%5 = add nsw <4 x i32> %broadcast.splat10, %3
%index.next = add i32 %index, 8
%6 = icmp eq i32 %index.next, 216
br i1 %6, label %middle.block, label %vector.body, !llvm.loop !0
...
The problematic instruction is
%3 = sub nuw nsw <4 x i32> %vec.phi1, %broadcast.splat6
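On the first iteration %vec.phi1 is zero, and "0 - value" with nuw does not
wrap only if the value is zero, so the optimizer may assume the result is zero
as well.
So after, for example, full loop unrolling, LLVM is allowed to eliminate this
instruction on the first iteration, but that is invalid: the scalar loop starts
from -35 and never subtracts from zero.
So the vectorizer should either clear the nuw/nsw flags or be smarter about them.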
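The effect of the flags on the first vector iteration can be checked with a small Python model. This is a sketch under assumptions: POISON, sub_nuw_nsw and first_vector_iteration are hypothetical names, None stands in for a poison lane, and %broadcast.splat4 and %broadcast.splat6 are assumed to be splats of %a (their definitions are in the elided part of the output).

```python
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1
UINT32_MAX = 2**32 - 1

POISON = None  # hypothetical stand-in for an LLVM poison value

def sub_nuw_nsw(x, y):
    """Model `sub nuw nsw i32 x, y` for signed Python ints in i32 range."""
    signed_ok = INT32_MIN <= x - y <= INT32_MAX
    # nuw: the unsigned subtraction must not borrow
    unsigned_ok = (x & UINT32_MAX) >= (y & UINT32_MAX)
    return x - y if signed_ok and unsigned_ok else POISON

def first_vector_iteration(a):
    # Lane values of %vec.phi and %vec.phi1 on entry to vector.body.
    vec_phi = [-35, 0, 0, 0]
    vec_phi1 = [0, 0, 0, 0]
    # Assumption: both splat operands hold %a in every lane.
    return ([sub_nuw_nsw(x, a) for x in vec_phi],
            [sub_nuw_nsw(x, a) for x in vec_phi1])
```

With a = 1 the scalar subtraction -35 - 1 satisfies both flags, but every lane that starts from the identity value 0 violates nuw, so the whole of %3 and three lanes of %2 come out as poison in this model. Only a = 0 keeps all lanes well defined.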
Could someone working on the LLVM vectorizer take a look at this?