[PATCH] D16382: Add LoopSimplifyCFG pass

escha via llvm-commits llvm-commits at lists.llvm.org
Wed Jan 20 16:55:26 PST 2016


escha created this revision.
escha added reviewers: chandlerc, resistor, mzolotukhin, hfinkel.
escha added a subscriber: llvm-commits.
escha set the repository for this revision to rL LLVM.
Herald added a subscriber: sanjoy.

Super short version: this is a loop pass that does trivial CFG simplification on a loop, as requested by Chandler as the solution to the real problem below. It isn't used in the pass manager yet. Right now it only merges consecutive blocks; it doesn't do anything fancier, but could in the future.

Details:

This IR has a perfectly reasonable nested loop that rotate -> unroll does not actually unroll all the way:

define i32 @foo(i32* %P, i64 *%Q) {
entry:
  br label %outer

outer:
  %y.2 = phi i32 [ 0, %entry ], [ %y.inc2, %outer.latch2 ]
  br label %inner

inner:
  %x.2 = phi i32 [ 0, %outer ], [ %inc2, %inner ]
  %inc2 = add nsw i32 %x.2, 1
  %exitcond2 = icmp eq i32 %inc2, 3
  store i32 %x.2, i32* %P
  br i1 %exitcond2, label %outer.latch, label %inner

outer.latch:
  %y.inc2 = add nsw i32 %y.2, 1
  %exitcond.outer = icmp eq i32 %y.inc2, 3
  store i32 %y.2, i32* %P
  br i1 %exitcond.outer, label %exit, label %outer.latch2

outer.latch2:
  %t = sext i32 %y.inc2 to i64
  store i64 %t, i64* %Q
  br label %outer

exit:
  ret i32 0
}
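
For reference, here is roughly what @foo does at the source level. This is a hand-written C++ approximation, not the source the IR was generated from; the do/while shape and the fixed-width types are assumptions read off the IR above.

```cpp
#include <cstdint>

// Hand-written approximation of @foo (assumption: not the original source).
int foo(std::int32_t *P, std::int64_t *Q) {
  std::int32_t y = 0;
  for (;;) {                            // outer
    std::int32_t x = 0;
    do {                                // inner: runs exactly three times
      *P = x;                           // store i32 %x.2, i32* %P
      ++x;                              // %inc2
    } while (x != 3);                   // %exitcond2
    *P = y;                             // outer.latch: store i32 %y.2
    ++y;                                // %y.inc2
    if (y == 3)                         // %exitcond.outer
      break;
    *Q = static_cast<std::int64_t>(y);  // outer.latch2: sext + store
  }
  return 0;                             // exit
}
```

Both trip counts are the constant 3, so in principle the whole nest can be flattened into straight-line stores.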

This is because after unrolling the inner loop, the outer loop has two header blocks, which, while valid and canonical in terms of LCSSA, is not what loop rotate understands. The hack solution is to run rotate -> unroll -> simplifycfg -> rotate -> unroll, which is bad. The slightly less hacky solution is to put this simplification into LoopSimplify, which Chandler argues is a bad idea because LoopSimplify specifically simplifies in ways that maintain the canonical form, and nothing else (and we may want to run LoopSimplifyCFG in other places for other reasons). Chandler suggests that the most general solution is just to add a much-needed LoopSimplifyCFG, which I did.
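
To make the "merge consecutive blocks" idea concrete, here is a minimal standalone sketch on a toy CFG. It does not use any LLVM API and is not the code in the patch; the Block struct, countPreds, mergeConsecutiveBlocks and makePostUnrollExample are hypothetical names, and the example graph is only an assumed shape for the outer loop after the inner loop has been fully unrolled.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy basic block (hypothetical, not llvm::BasicBlock). `insts` holds the
// non-terminator instructions; the terminator is implied by `succs`.
struct Block {
  std::string name;
  std::vector<std::string> insts;
  std::vector<std::size_t> succs;  // indices into the CFG vector
  bool removed;                    // false when omitted from a braced init
};

// Count edges into `target` from blocks that are still live.
std::size_t countPreds(const std::vector<Block> &cfg, std::size_t target) {
  std::size_t n = 0;
  for (const Block &b : cfg) {
    if (b.removed) continue;
    for (std::size_t s : b.succs)
      if (s == target) ++n;
  }
  return n;
}

// Fold a block into its predecessor when the predecessor has exactly one
// successor and that successor has exactly one predecessor. Repeats until
// nothing changes and returns the number of merges performed.
std::size_t mergeConsecutiveBlocks(std::vector<Block> &cfg) {
  std::size_t merges = 0;
  bool changed = true;
  while (changed) {
    changed = false;
    for (std::size_t i = 0; i < cfg.size(); ++i) {
      Block &pred = cfg[i];
      if (pred.removed || pred.succs.size() != 1) continue;
      std::size_t s = pred.succs[0];
      if (s == i || countPreds(cfg, s) != 1) continue;
      Block &succ = cfg[s];
      pred.insts.insert(pred.insts.end(), succ.insts.begin(), succ.insts.end());
      pred.succs = succ.succs;  // pred inherits succ's outgoing edges
      succ.removed = true;
      ++merges;
      changed = true;
    }
  }
  return merges;
}

// Assumed post-unroll shape: the header is split across "outer" and the
// straight-line block left behind by the unrolled inner loop.
std::vector<Block> makePostUnrollExample() {
  return {
      {"entry", {}, {1}},
      {"outer", {"%y.2 = phi ..."}, {2}},
      {"inner.unrolled", {"unrolled inner stores", "outer.latch body"}, {4, 3}},
      {"outer.latch2", {"store i64 %t, i64* %Q"}, {1}},
      {"exit", {}, {}},
  };
}
```

On this example the sketch performs one merge: "inner.unrolled" is folded into "outer", leaving a single header block that ends in the conditional exit branch. A real implementation also has to deal with PHI nodes in the merged block and keep loop and dominator information up to date, which this toy version ignores.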

The problem with using this right now is that in practice, you need a pipeline that looks like this to make use of it:

LoopPassManager:
- Loop SimplifyCFG
- Loop Rotate
- Loop Unroll

Currently the PassManagerBuilder causes the LPMs to be split up, because required analyses get inserted in between the loop passes (which Chandler is working on). However, with a shim to require the associated analyses, this does work in practice in our out-of-tree pipeline, and a test just for this pass is included.

This is important to us because we have critical benchmark code that takes a form similar to this and similarly fails to unroll.

Repository:
  rL LLVM

http://reviews.llvm.org/D16382

Files:
  include/llvm/InitializePasses.h
  include/llvm/LinkAllPasses.h
  include/llvm/Transforms/Scalar.h
  lib/Transforms/Scalar/CMakeLists.txt
  lib/Transforms/Scalar/LoopSimplifyCFG.cpp
  lib/Transforms/Scalar/Scalar.cpp
  test/Transforms/LoopSimplifyCFG/merge-header.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D16382.45465.patch
Type: text/x-patch
Size: 8254 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20160121/365844b4/attachment-0001.bin>
