[llvm-bugs] [Bug 43828] New: nowrap flags are not always correct after vectorization
via llvm-bugs
llvm-bugs at lists.llvm.org
Mon Oct 28 06:41:28 PDT 2019
https://bugs.llvm.org/show_bug.cgi?id=43828
Bug ID: 43828
Summary: nowrap flags are not always correct after vectorization
Product: libraries
Version: trunk
Hardware: PC
OS: All
Status: NEW
Severity: normal
Priority: P
Component: Loop Optimizer
Assignee: unassignedbugs at nondot.org
Reporter: dantrushin at gmail.com
CC: llvm-bugs at lists.llvm.org
Created attachment 22738
--> https://bugs.llvm.org/attachment.cgi?id=22738&action=edit
Test to demonstrate wrong vectorizer behavior
When widening instructions, the loop vectorizer always copies IR flags
(including nowrap) from the scalar instruction to the new vector instruction.
This is not always correct. Consider a subtract reduction loop that is
vectorized and interleaved:
outer_loop:
  %local_4 = phi i32 [ 2, %entry ], [ %4, %outer_tail ]
  br label %inner_loop
inner_loop:
  %local_2 = phi i32 [ 0, %outer_loop ], [ %1, %inner_loop ]
  %local_3 = phi i32 [ -104, %outer_loop ], [ %0, %inner_loop ]
  %0 = sub nuw nsw i32 %local_3, %local_4
  %1 = add nuw nsw i32 %local_2, 1
  %2 = icmp ugt i32 %local_2, 126
  br i1 %2, label %outer_tail, label %inner_loop
outer_tail:
  %3 = phi i32 [ %0, %inner_loop ]
  %4 = add i32 %local_4, 1
  %5 = icmp slt i32 %4, 6
  br i1 %5, label %outer_loop, label %exit
Note the nuw/nsw flags on the sub instruction: they are correct for the scalar
code. After vectorization it becomes:
vector.ph: ; preds = %outer_loop
  %broadcast.splatinsert3 = insertelement <4 x i32> undef, i32 %local_4, i32 0
  %broadcast.splat4 = shufflevector <4 x i32> %broadcast.splatinsert3, <4 x i32> undef, <4 x i32> zeroinitializer
  br label %vector.body
vector.body: ; preds = %vector.body, %vector.ph
  %index = phi i32 [ 0, %vector.ph ], [ %index.next, %vector.body ]
  %vec.phi = phi <4 x i32> [ <i32 -104, i32 0, i32 0, i32 0>, %vector.ph ], [ %0, %vector.body ]
  %vec.phi2 = phi <4 x i32> [ zeroinitializer, %vector.ph ], [ %1, %vector.body ]
  %0 = sub nuw nsw <4 x i32> %vec.phi, %broadcast.splat4
  %1 = sub nuw nsw <4 x i32> %vec.phi2, %broadcast.splat4
  %index.next = add i32 %index, 8
  %2 = icmp eq i32 %index.next, 128
  br i1 %2, label %middle.block, label %vector.body, !llvm.loop !0
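Note that the %1 sub still has the nuw flag set, but the flag is now incorrect:
%vec.phi2 starts as zeroinitializer, so the first iteration computes
0 - %local_4, which wraps in unsigned arithmetic.
Because of this flag, later optimizations remove the second sub instruction
[ (0 - x)<nuw> -> 0 ], which results in incorrect code.
A simple test case is attached (unrolling the vectorized loop makes the problem
clearly visible).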
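
The arithmetic behind this can be checked with a small numeric model. The
sketch below is not LLVM code and does not reproduce the attached test; it only
mimics the i32 arithmetic of the two loops above in Python. All function names
are hypothetical, and the fold_nuw switch is a simplified assumption about the
(0 - x)<nuw> -> 0 fold: it keeps an interleaved accumulator at zero whenever
all of its lanes are zero, as happens for %vec.phi2.

```python
MASK = 0xFFFFFFFF  # i32 arithmetic is modulo 2**32


def sub_wraps_unsigned(a, b):
    """True if a - b wraps when both are read as unsigned 32-bit values."""
    return (a & MASK) < (b & MASK)


def scalar_reduction(x, start=-104, trips=128):
    """Model of the scalar inner loop: acc -= x, repeated `trips` times.

    Returns (final value as unsigned i32, whether nuw was ever violated).
    """
    acc = start & MASK
    violated = False
    for _ in range(trips):
        violated |= sub_wraps_unsigned(acc, x)
        acc = (acc - x) & MASK
    return acc, violated


def vector_reduction(x, start=-104, trips=128, vf=4, ic=2, fold_nuw=False):
    """Model of the vectorized, interleaved loop (VF=4, interleave count 2).

    Only lane 0 of the first accumulator holds the start value; every other
    lane starts at 0. With fold_nuw=True (an assumed simplification), an
    all-zero accumulator is never updated, imitating (0 - x)<nuw> -> 0.
    """
    parts = [[0] * vf for _ in range(ic)]
    parts[0][0] = start & MASK
    violated = False
    for _ in range(trips // (vf * ic)):
        for part in parts:
            if fold_nuw and all(lane == 0 for lane in part):
                continue  # folded away: the accumulator stays zero
            for i, lane in enumerate(part):
                violated |= sub_wraps_unsigned(lane, x)
                part[i] = (lane - x) & MASK
    # The partial results are summed after the loop to form the final value.
    total = sum(lane for part in parts for lane in part) & MASK
    return total, violated


if __name__ == "__main__":
    for x in range(2, 6):  # values taken by %local_4 in the outer loop
        print(x, scalar_reduction(x), vector_reduction(x),
              vector_reduction(x, fold_nuw=True))
```

In this model the scalar loop never violates nuw for %local_4 in 2..5, while
the vector loop violates it on the first iteration in every lane that starts
at zero. The unfolded vector result still matches the scalar one modulo 2**32,
so the wrap itself is harmless; the result only changes once the flag is
trusted and the second sub is folded away.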