[llvm] r221009 - Correctly update dom-tree after loop vectorizer.

Michael Zolotukhin mzolotukhin at apple.com
Sat Nov 1 12:08:40 PDT 2014


Hi Eric,
> On Oct 31, 2014, at 5:58 PM, Eric Christopher <echristo at gmail.com> wrote:
> 
> Few comments:
> 
> Please remove unnecessary attributes from test cases.
Sure, I’ll try to trim them. This testcase was already reduced from a bigger one with bugpoint, though, so I assumed it was already fully minimized.

> Also, can you explain why this is the correct fix? Your commit message isn't enlightening and neither is the testcase.
There was some explanation in another thread, but I’ll try to reproduce it with a little history here.

Some time ago Chandler introduced a new flag ‘-extra-vectorizer-passes’, and I decided to run some benchmarks to see if there are any gains from it on my configuration. The extra passes are CSE, InstCombine, and SimplifyCFG invoked several times before and after the loop vectorizer pass. But with that flag several SPEC benchmarks failed with a “Broken function” error. I tried to find out which pass was guilty and discovered that the dom-tree info is incorrect after the loop vectorizer. Fixing it then wasn’t hard.

Now back to why this fix is correct (this is almost a copy-paste from the other thread and from the source code).

The vectorizer generates the following CFG:
     [ ] <-- Back-edge taken count overflow check. <=== This is LoopBypassBlocks[0]
   /   |
  /    v
 |    [ ] <-- vector loop bypass (may consist of multiple blocks).
 |  /  |
 | /   v
 ||   [ ]     <-- vector pre header.
 ||    |
 ||    v
 ||   [  ] \
 ||   [  ]_|   <-- vector loop.
 ||    |
 | \   v
 |   >[ ]   <--- middle-block.
 |  /  |
 | /   v
 -|- >[ ]     <--- new preheader.
  |    |
  |    v
  |   [ ] \
  |   [ ]_|   <-- old scalar loop to handle remainder.
   \   |
    \  v
     [ ]  <-- exit block

In the past there were no overflow checks, and the immediate dominator of 'exit block' was 'middle block'. But later the check was added, which introduced another edge to 'exit block', and 'middle block' stopped dominating 'exit block'.
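
To make the dominance argument concrete, here is a minimal standalone sketch (plain C++, not LLVM code). It encodes the CFG above as predecessor lists and computes immediate dominators with the textbook iterative set-intersection method. The block names, both helper functions, and the exact edge lists are assumptions read off the diagram. In particular, preCheckPreds() simply drops the edge from the overflow check to the new preheader as a rough stand-in for the older CFG.

```cpp
#include <array>
#include <bitset>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical block names, one per box in the diagram (top to bottom).
enum Block {
  OverflowCheck,   // LoopBypassBlocks[0]
  VecBypass,       // vector loop bypass
  VecPreheader,
  VecLoop,
  Middle,
  ScalarPreheader, // "new preheader"
  ScalarLoop,
  Exit,
  NumBlocks
};

using PredLists = std::vector<std::vector<int>>;
using DomSet = std::bitset<NumBlocks>;

// Predecessor lists as read off the diagram (an assumption, not LLVM output).
PredLists currentPreds() {
  PredLists P(NumBlocks);
  P[VecBypass] = {OverflowCheck};
  P[VecPreheader] = {VecBypass};
  P[VecLoop] = {VecPreheader, VecLoop};
  P[Middle] = {VecBypass, VecLoop};
  P[ScalarPreheader] = {OverflowCheck, Middle};
  P[ScalarLoop] = {ScalarPreheader, ScalarLoop};
  P[Exit] = {Middle, ScalarLoop};
  return P;
}

// Same CFG without the overflow-check edge into the new preheader
// (a rough approximation of the situation before the check existed).
PredLists preCheckPreds() {
  PredLists P = currentPreds();
  P[ScalarPreheader] = {Middle};
  return P;
}

// Iterative dominator computation; block 0 is assumed to be the entry.
// Returns the immediate dominator of each block (-1 for the entry).
std::array<int, NumBlocks> computeIDoms(const PredLists &Preds) {
  std::array<DomSet, NumBlocks> Dom;
  for (DomSet &D : Dom)
    D.set();                 // start from "dominated by everything"
  Dom[0].reset();
  Dom[0].set(0);             // the entry is dominated only by itself

  bool Changed = true;
  while (Changed) {
    Changed = false;
    for (int B = 1; B < NumBlocks; ++B) {
      DomSet New;
      New.set();
      for (int P : Preds[B])
        New &= Dom[P];       // intersect over all predecessors
      New.set(B);            // every block dominates itself
      if (New != Dom[B]) {
        Dom[B] = New;
        Changed = true;
      }
    }
  }

  // The strict dominators of a block form a chain; the immediate dominator
  // is the one closest to the block, i.e. with the largest dominator set.
  std::array<int, NumBlocks> IDom;
  IDom.fill(-1);
  for (int B = 1; B < NumBlocks; ++B) {
    std::size_t Best = 0;
    for (int D = 0; D < NumBlocks; ++D) {
      if (D != B && Dom[B].test(D) && Dom[D].count() > Best) {
        Best = Dom[D].count();
        IDom[B] = D;
      }
    }
  }
  return IDom;
}
```

With these edge lists, computeIDoms(currentPreds())[Exit] evaluates to OverflowCheck, the block the patch refers to as LoopBypassBlocks[0], while computeIDoms(preCheckPreds())[Exit] evaluates to Middle, which is what the old changeImmediateDominator call assumed.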

The testcase itself isn’t obvious, but if you run it without the change, the verifier will fail.

Thanks,
Michael

> 
> -eric
> 
> On Fri Oct 31 2014 at 3:42:29 PM Michael Zolotukhin <mzolotukhin at apple.com> wrote:
> Author: mzolotukhin
> Date: Fri Oct 31 17:28:03 2014
> New Revision: 221009
> 
> URL: http://llvm.org/viewvc/llvm-project?rev=221009&view=rev
> Log:
> Correctly update dom-tree after loop vectorizer.
> 
> Added:
>     llvm/trunk/test/Transforms/LoopVectorize/incorrect-dom-info.ll
> Modified:
>     llvm/trunk/lib/Transforms/Vectorize/LoopVectorize.cpp
> 
> Modified: llvm/trunk/lib/Transforms/Vectorize/LoopVectorize.cpp
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Vectorize/LoopVectorize.cpp?rev=221009&r1=221008&r2=221009&view=diff
> ==============================================================================
> --- llvm/trunk/lib/Transforms/Vectorize/LoopVectorize.cpp (original)
> +++ llvm/trunk/lib/Transforms/Vectorize/LoopVectorize.cpp Fri Oct 31 17:28:03 2014
> @@ -3428,7 +3428,7 @@ void InnerLoopVectorizer::updateAnalysis
>    DT->addNewBlock(LoopMiddleBlock, LoopBypassBlocks[1]);
>    DT->addNewBlock(LoopScalarPreHeader, LoopBypassBlocks[0]);
>    DT->changeImmediateDominator(LoopScalarBody, LoopScalarPreHeader);
> -  DT->changeImmediateDominator(LoopExitBlock, LoopMiddleBlock);
> +  DT->changeImmediateDominator(LoopExitBlock, LoopBypassBlocks[0]);
> 
>    DEBUG(DT->verifyDomTree());
>  }
> 
> Added: llvm/trunk/test/Transforms/LoopVectorize/incorrect-dom-info.ll
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/LoopVectorize/incorrect-dom-info.ll?rev=221009&view=auto
> ==============================================================================
> --- llvm/trunk/test/Transforms/LoopVectorize/incorrect-dom-info.ll (added)
> +++ llvm/trunk/test/Transforms/LoopVectorize/incorrect-dom-info.ll Fri Oct 31 17:28:03 2014
> @@ -0,0 +1,142 @@
> +; This test is based on one of benchmarks from SPEC2006. It exposes a bug with
> +; incorrect updating of the dom-tree.
> +; RUN: opt < %s  -loop-vectorize -verify-dom-info
> +target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"
> +
> +@PL_utf8skip = external constant [0 x i8]
> +
> +; Function Attrs: nounwind ssp uwtable
> +define void @Perl_pp_quotemeta() #0 {
> +  %len = alloca i64, align 8
> +  br i1 undef, label %2, label %1
> +
> +; <label>:1                                       ; preds = %0
> +  br label %3
> +
> +; <label>:2                                       ; preds = %0
> +  br label %3
> +
> +; <label>:3                                       ; preds = %2, %1
> +  br i1 undef, label %34, label %4
> +
> +; <label>:4                                       ; preds = %3
> +  br i1 undef, label %5, label %6
> +
> +; <label>:5                                       ; preds = %4
> +  br label %6
> +
> +; <label>:6                                       ; preds = %5, %4
> +  br i1 undef, label %7, label %8
> +
> +; <label>:7                                       ; preds = %6
> +  br label %8
> +
> +; <label>:8                                       ; preds = %7, %6
> +  br i1 undef, label %.preheader, label %9
> +
> +.preheader:                                       ; preds = %9, %8
> +  br i1 undef, label %.loopexit, label %.lr.ph
> +
> +; <label>:9                                       ; preds = %8
> +  br i1 undef, label %thread-pre-split.preheader, label %.preheader
> +
> +thread-pre-split.preheader:                       ; preds = %9
> +  br i1 undef, label %thread-pre-split._crit_edge, label %.lr.ph21
> +
> +.thread-pre-split.loopexit_crit_edge:             ; preds = %19
> +  %scevgep.sum = xor i64 %umax, -1
> +  %scevgep45 = getelementptr i8* %d.020, i64 %scevgep.sum
> +  br label %thread-pre-split.loopexit
> +
> +thread-pre-split.loopexit:                        ; preds = %11, %.thread-pre-split.loopexit_crit_edge
> +  %d.1.lcssa = phi i8* [ %scevgep45, %.thread-pre-split.loopexit_crit_edge ], [ %d.020, %11 ]
> +  br i1 false, label %thread-pre-split._crit_edge, label %.lr.ph21
> +
> +.lr.ph21:                                         ; preds = %26, %thread-pre-split.loopexit, %thread-pre-split.preheader
> +  %d.020 = phi i8* [ undef, %26 ], [ %d.1.lcssa, %thread-pre-split.loopexit ], [ undef, %thread-pre-split.preheader ]
> +  %10 = phi i64 [ %28, %26 ], [ undef, %thread-pre-split.loopexit ], [ undef, %thread-pre-split.preheader ]
> +  br i1 undef, label %11, label %22
> +
> +; <label>:11                                      ; preds = %.lr.ph21
> +  %12 = getelementptr inbounds [0 x i8]* @PL_utf8skip, i64 0, i64 undef
> +  %13 = load i8* %12, align 1
> +  %14 = zext i8 %13 to i64
> +  %15 = icmp ugt i64 %14, %10
> +  %. = select i1 %15, i64 %10, i64 %14
> +  br i1 undef, label %thread-pre-split.loopexit, label %.lr.ph28
> +
> +.lr.ph28:                                         ; preds = %11
> +  %16 = xor i64 %10, -1
> +  %17 = xor i64 %14, -1
> +  %18 = icmp ugt i64 %16, %17
> +  %umax = select i1 %18, i64 %16, i64 %17
> +  br label %19
> +
> +; <label>:19                                      ; preds = %19, %.lr.ph28
> +  %ulen.126 = phi i64 [ %., %.lr.ph28 ], [ %20, %19 ]
> +  %20 = add i64 %ulen.126, -1
> +  %21 = icmp eq i64 %20, 0
> +  br i1 %21, label %.thread-pre-split.loopexit_crit_edge, label %19
> +
> +; <label>:22                                      ; preds = %.lr.ph21
> +  br i1 undef, label %26, label %23
> +
> +; <label>:23                                      ; preds = %22
> +  br i1 undef, label %26, label %24
> +
> +; <label>:24                                      ; preds = %23
> +  br i1 undef, label %26, label %25
> +
> +; <label>:25                                      ; preds = %24
> +  br label %26
> +
> +; <label>:26                                      ; preds = %25, %24, %23, %22
> +  %27 = load i64* %len, align 8
> +  %28 = add i64 %27, -1
> +  br i1 undef, label %thread-pre-split._crit_edge, label %.lr.ph21
> +
> +thread-pre-split._crit_edge:                      ; preds = %26, %thread-pre-split.loopexit, %thread-pre-split.preheader
> +  br label %.loopexit
> +
> +.lr.ph:                                           ; preds = %33, %.preheader
> +  br i1 undef, label %29, label %thread-pre-split5
> +
> +; <label>:29                                      ; preds = %.lr.ph
> +  br i1 undef, label %33, label %30
> +
> +; <label>:30                                      ; preds = %29
> +  br i1 undef, label %33, label %31
> +
> +thread-pre-split5:                                ; preds = %.lr.ph
> +  br i1 undef, label %33, label %31
> +
> +; <label>:31                                      ; preds = %thread-pre-split5, %30
> +  br i1 undef, label %33, label %32
> +
> +; <label>:32                                      ; preds = %31
> +  br label %33
> +
> +; <label>:33                                      ; preds = %32, %31, %thread-pre-split5, %30, %29
> +  br i1 undef, label %.loopexit, label %.lr.ph
> +
> +.loopexit:                                        ; preds = %33, %thread-pre-split._crit_edge, %.preheader
> +  br label %35
> +
> +; <label>:34                                      ; preds = %3
> +  br label %35
> +
> +; <label>:35                                      ; preds = %34, %.loopexit
> +  br i1 undef, label %37, label %36
> +
> +; <label>:36                                      ; preds = %35
> +  br label %37
> +
> +; <label>:37                                      ; preds = %36, %35
> +  ret void
> +}
> +
> +attributes #0 = { nounwind ssp uwtable "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" }
> +
> +!llvm.ident = !{!0}
> +
> +!0 = metadata !{metadata !"clang version 3.6.0 "}
> 
> 
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
