[llvm] r221009 - Correctly update dom-tree after loop vectorizer.
Eric Christopher
echristo at gmail.com
Mon Nov 3 18:53:18 PST 2014
On Sat Nov 01 2014 at 12:08:49 PM Michael Zolotukhin <mzolotukhin at apple.com>
wrote:
> Hi Eric,
>
> On Oct 31, 2014, at 5:58 PM, Eric Christopher <echristo at gmail.com> wrote:
>
> Few comments:
>
> Please remove unnecessary attributes from test cases.
>
> Sure, I’ll try to trim them. Though this testcase is already reduced from
> a bigger one with bugpoint, so I assumed it’s already ultimately minimized.
>
> Also, can you explain why this is the correct fix? Your commit message
> isn't enlightening and neither is the testcase.
>
> There was some explanation in another thread, but I’ll try to reproduce it
> with a little history here.
>
> Some time ago Chandler introduced a new flag ‘-extra-vectorizer-passes’,
> and I decided to run some benchmarks to see if there are any gains from it
> on my configuration. The extra passes are CSE, InstCombine, and SimplifyCFG
> invoked several times before and after the loop vectorizer pass. But with
> that flag several SPEC benchmarks failed with a “Broken function” error. I
> tried to find out which pass was guilty and discovered that the dom-tree
> info is incorrect after the loop vectorizer. Fixing it then wasn’t hard.
>
> Now back to why this fix is correct (that would be almost copy-paste from
> the other thread and from the source code).
>
> Vectorizer generates the following CFG:
>        [ ] <-- Back-edge taken count overflow check. <=== This is LoopBypassBlocks[0]
>     /   |
>    /    v
>   |    [ ] <-- vector loop bypass (may consist of multiple blocks).
>   |  /  |
>   | /   v
>   ||   [ ]     <-- vector pre header.
>   ||    |
>   ||    v
>   ||   [ ] \
>   ||   [ ]_|   <-- vector loop.
>   ||    |
>   | \   v
>   |   >[ ]   <--- middle-block.
>   |  /  |
>   | /   v
>   -|- >[ ]     <--- new preheader.
>    |    |
>    |    v
>    |   [ ] \
>    |   [ ]_|   <-- old scalar loop to handle remainder.
>     \   |
>      \  v
>       [ ]     <-- exit block
>
> In the past there were no overflow checks, and the immediate dominator of
> 'exit block' was ‘middle block’. But later the check was added, which
> introduced another edge to 'exit block', and 'middle block' stopped
> dominating 'exit block'.
>
>
Excellent, thanks for the explanation. :)
-eric
> The testcase itself isn’t obvious, but if you check it without the change,
> verifier will fail.
>
> Thanks,
> Michael
>
>
> -eric
>
> On Fri Oct 31 2014 at 3:42:29 PM Michael Zolotukhin <mzolotukhin at apple.com>
> wrote:
>
>> Author: mzolotukhin
>> Date: Fri Oct 31 17:28:03 2014
>> New Revision: 221009
>>
>> URL: http://llvm.org/viewvc/llvm-project?rev=221009&view=rev
>> Log:
>> Correctly update dom-tree after loop vectorizer.
>>
>> Added:
>> llvm/trunk/test/Transforms/LoopVectorize/incorrect-dom-info.ll
>> Modified:
>> llvm/trunk/lib/Transforms/Vectorize/LoopVectorize.cpp
>>
>> Modified: llvm/trunk/lib/Transforms/Vectorize/LoopVectorize.cpp
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Vectorize/LoopVectorize.cpp?rev=221009&r1=221008&r2=221009&view=diff
>> ==============================================================================
>> --- llvm/trunk/lib/Transforms/Vectorize/LoopVectorize.cpp (original)
>> +++ llvm/trunk/lib/Transforms/Vectorize/LoopVectorize.cpp Fri Oct 31 17:28:03 2014
>> @@ -3428,7 +3428,7 @@ void InnerLoopVectorizer::updateAnalysis
>>    DT->addNewBlock(LoopMiddleBlock, LoopBypassBlocks[1]);
>>    DT->addNewBlock(LoopScalarPreHeader, LoopBypassBlocks[0]);
>>    DT->changeImmediateDominator(LoopScalarBody, LoopScalarPreHeader);
>> -  DT->changeImmediateDominator(LoopExitBlock, LoopMiddleBlock);
>> +  DT->changeImmediateDominator(LoopExitBlock, LoopBypassBlocks[0]);
>>
>>    DEBUG(DT->verifyDomTree());
>>  }
>>
>> Added: llvm/trunk/test/Transforms/LoopVectorize/incorrect-dom-info.ll
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/LoopVectorize/incorrect-dom-info.ll?rev=221009&view=auto
>> ==============================================================================
>> --- llvm/trunk/test/Transforms/LoopVectorize/incorrect-dom-info.ll (added)
>> +++ llvm/trunk/test/Transforms/LoopVectorize/incorrect-dom-info.ll Fri Oct 31 17:28:03 2014
>> @@ -0,0 +1,142 @@
>> +; This test is based on one of benchmarks from SPEC2006. It exposes a bug with
>> +; incorrect updating of the dom-tree.
>> +; RUN: opt < %s -loop-vectorize -verify-dom-info
>> +target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"
>> +
>> +@PL_utf8skip = external constant [0 x i8]
>> +
>> +; Function Attrs: nounwind ssp uwtable
>> +define void @Perl_pp_quotemeta() #0 {
>> + %len = alloca i64, align 8
>> + br i1 undef, label %2, label %1
>> +
>> +; <label>:1 ; preds = %0
>> + br label %3
>> +
>> +; <label>:2 ; preds = %0
>> + br label %3
>> +
>> +; <label>:3 ; preds = %2, %1
>> + br i1 undef, label %34, label %4
>> +
>> +; <label>:4 ; preds = %3
>> + br i1 undef, label %5, label %6
>> +
>> +; <label>:5 ; preds = %4
>> + br label %6
>> +
>> +; <label>:6 ; preds = %5, %4
>> + br i1 undef, label %7, label %8
>> +
>> +; <label>:7 ; preds = %6
>> + br label %8
>> +
>> +; <label>:8 ; preds = %7, %6
>> + br i1 undef, label %.preheader, label %9
>> +
>> +.preheader: ; preds = %9, %8
>> + br i1 undef, label %.loopexit, label %.lr.ph
>> +
>> +; <label>:9 ; preds = %8
>> + br i1 undef, label %thread-pre-split.preheader, label %.preheader
>> +
>> +thread-pre-split.preheader: ; preds = %9
>> + br i1 undef, label %thread-pre-split._crit_edge, label %.lr.ph21
>> +
>> +.thread-pre-split.loopexit_crit_edge: ; preds = %19
>> + %scevgep.sum = xor i64 %umax, -1
>> + %scevgep45 = getelementptr i8* %d.020, i64 %scevgep.sum
>> + br label %thread-pre-split.loopexit
>> +
>> +thread-pre-split.loopexit: ; preds = %11, %.thread-pre-split.loopexit_crit_edge
>> + %d.1.lcssa = phi i8* [ %scevgep45, %.thread-pre-split.loopexit_crit_edge ], [ %d.020, %11 ]
>> + br i1 false, label %thread-pre-split._crit_edge, label %.lr.ph21
>> +
>> +.lr.ph21: ; preds = %26, %thread-pre-split.loopexit, %thread-pre-split.preheader
>> + %d.020 = phi i8* [ undef, %26 ], [ %d.1.lcssa, %thread-pre-split.loopexit ], [ undef, %thread-pre-split.preheader ]
>> + %10 = phi i64 [ %28, %26 ], [ undef, %thread-pre-split.loopexit ], [ undef, %thread-pre-split.preheader ]
>> + br i1 undef, label %11, label %22
>> +
>> +; <label>:11 ; preds = %.lr.ph21
>> + %12 = getelementptr inbounds [0 x i8]* @PL_utf8skip, i64 0, i64 undef
>> + %13 = load i8* %12, align 1
>> + %14 = zext i8 %13 to i64
>> + %15 = icmp ugt i64 %14, %10
>> + %. = select i1 %15, i64 %10, i64 %14
>> + br i1 undef, label %thread-pre-split.loopexit, label %.lr.ph28
>> +
>> +.lr.ph28: ; preds = %11
>> + %16 = xor i64 %10, -1
>> + %17 = xor i64 %14, -1
>> + %18 = icmp ugt i64 %16, %17
>> + %umax = select i1 %18, i64 %16, i64 %17
>> + br label %19
>> +
>> +; <label>:19 ; preds = %19, %.lr.ph28
>> + %ulen.126 = phi i64 [ %., %.lr.ph28 ], [ %20, %19 ]
>> + %20 = add i64 %ulen.126, -1
>> + %21 = icmp eq i64 %20, 0
>> + br i1 %21, label %.thread-pre-split.loopexit_crit_edge, label %19
>> +
>> +; <label>:22 ; preds = %.lr.ph21
>> + br i1 undef, label %26, label %23
>> +
>> +; <label>:23 ; preds = %22
>> + br i1 undef, label %26, label %24
>> +
>> +; <label>:24 ; preds = %23
>> + br i1 undef, label %26, label %25
>> +
>> +; <label>:25 ; preds = %24
>> + br label %26
>> +
>> +; <label>:26 ; preds = %25, %24, %23, %22
>> + %27 = load i64* %len, align 8
>> + %28 = add i64 %27, -1
>> + br i1 undef, label %thread-pre-split._crit_edge, label %.lr.ph21
>> +
>> +thread-pre-split._crit_edge: ; preds = %26, %thread-pre-split.loopexit, %thread-pre-split.preheader
>> + br label %.loopexit
>> +
>> +.lr.ph: ; preds = %33, %.preheader
>> + br i1 undef, label %29, label %thread-pre-split5
>> +
>> +; <label>:29 ; preds = %.lr.ph
>> + br i1 undef, label %33, label %30
>> +
>> +; <label>:30 ; preds = %29
>> + br i1 undef, label %33, label %31
>> +
>> +thread-pre-split5: ; preds = %.lr.ph
>> + br i1 undef, label %33, label %31
>> +
>> +; <label>:31 ; preds = %thread-pre-split5, %30
>> + br i1 undef, label %33, label %32
>> +
>> +; <label>:32 ; preds = %31
>> + br label %33
>> +
>> +; <label>:33 ; preds = %32, %31, %thread-pre-split5, %30, %29
>> + br i1 undef, label %.loopexit, label %.lr.ph
>> +
>> +.loopexit: ; preds = %33, %thread-pre-split._crit_edge, %.preheader
>> + br label %35
>> +
>> +; <label>:34 ; preds = %3
>> + br label %35
>> +
>> +; <label>:35 ; preds = %34, %.loopexit
>> + br i1 undef, label %37, label %36
>> +
>> +; <label>:36 ; preds = %35
>> + br label %37
>> +
>> +; <label>:37 ; preds = %36, %35
>> + ret void
>> +}
>> +
>> +attributes #0 = { nounwind ssp uwtable "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" }
>> +
>> +!llvm.ident = !{!0}
>> +
>> +!0 = metadata !{metadata !"clang version 3.6.0 "}
>>
>>
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>
>
>
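For anyone who wants to check the dominator argument quoted above without building LLVM, here is a minimal standalone sketch. It does not use any LLVM API. The block names, the edge list (read off the CFG diagram), and the helper functions vectorizedCFG and immediateDominators are all assumptions made for illustration. Dropping the overflow-check edge is only a rough approximation of the CFG shape before the check was added.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical block numbering for the CFG in the diagram (not LLVM names).
enum Block : int {
  OverflowCheck,    // LoopBypassBlocks[0]
  VectorBypass,     // LoopBypassBlocks[1]
  VectorPreHeader,
  VectorLoop,
  MiddleBlock,
  ScalarPreHeader,  // "new preheader" in the diagram
  ScalarLoop,
  ExitBlock,
  NumBlocks
};

using Edge = std::pair<int, int>;

// Edges as read off the diagram (an assumption of this sketch). The flag
// controls the extra edge that lets control skip the vector code entirely.
std::vector<Edge> vectorizedCFG(bool withOverflowCheckEdge) {
  std::vector<Edge> edges = {
      {OverflowCheck, VectorBypass},  {VectorBypass, VectorPreHeader},
      {VectorBypass, MiddleBlock},    {VectorPreHeader, VectorLoop},
      {VectorLoop, VectorLoop},       {VectorLoop, MiddleBlock},
      {MiddleBlock, ScalarPreHeader}, {MiddleBlock, ExitBlock},
      {ScalarPreHeader, ScalarLoop},  {ScalarLoop, ScalarLoop},
      {ScalarLoop, ExitBlock},
  };
  if (withOverflowCheckEdge)
    edges.push_back({OverflowCheck, ScalarPreHeader});
  return edges;
}

// Immediate dominators via the simple iterative set-based dataflow.
// Block 0 is treated as the entry; idom of the entry is reported as -1.
std::vector<int> immediateDominators(int numBlocks,
                                     const std::vector<Edge> &edges) {
  assert(numBlocks > 0);
  // dom[b][d] == true means "d dominates b".
  std::vector<std::vector<bool>> dom(numBlocks,
                                     std::vector<bool>(numBlocks, true));
  dom[0].assign(numBlocks, false);
  dom[0][0] = true;

  bool changed = true;
  while (changed) {
    changed = false;
    for (int b = 1; b < numBlocks; ++b) {
      // Intersect the dominator sets of all predecessors of b.
      std::vector<bool> next(numBlocks, true);
      bool hasPred = false;
      for (const Edge &e : edges) {
        if (e.second != b)
          continue;
        hasPred = true;
        for (int d = 0; d < numBlocks; ++d)
          next[d] = next[d] && dom[e.first][d];
      }
      if (!hasPred)
        next.assign(numBlocks, false);
      next[b] = true;  // every block dominates itself
      if (next != dom[b]) {
        dom[b] = next;
        changed = true;
      }
    }
  }

  // The strict dominators of a block form a chain; the immediate dominator
  // is the one with the largest dominator set of its own.
  std::vector<int> idom(numBlocks, -1);
  for (int b = 1; b < numBlocks; ++b) {
    int best = -1, bestSize = -1;
    for (int d = 0; d < numBlocks; ++d) {
      if (d == b || !dom[b][d])
        continue;
      int size = 0;
      for (int x = 0; x < numBlocks; ++x)
        size += dom[d][x] ? 1 : 0;
      if (size > bestSize) {
        bestSize = size;
        best = d;
      }
    }
    idom[b] = best;
  }
  return idom;
}
```

Calling immediateDominators(NumBlocks, vectorizedCFG(true)) gives OverflowCheck as the immediate dominator of ExitBlock, which is what the patched changeImmediateDominator call records. With vectorizedCFG(false) the same query gives MiddleBlock, matching the old code.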