[PATCH][instcombine] Remove preservesCFG from instcombine

Mark Heffernan meheff at google.com
Mon Nov 3 16:24:21 PST 2014


On Mon, Nov 3, 2014 at 3:44 PM, Andrew Trick <atrick at apple.com> wrote:

>
> > On Nov 3, 2014, at 2:49 PM, Hal Finkel <hfinkel at anl.gov> wrote:
> >
> > Can we still have InstCombine preserve the DomTree?
>
> InstCombine *really* should preserve DomTree and LoopInfo. But one of
> those may be causing the missed optimization. Maybe
> DominatorTree::isReachableFromEntry needs to be updated? That could in turn
> affect LoopInfo or LCSSA.
>

I think DomTree is preserved.  In fact, there is a test which fails if I
*don't* preserve DomTree:
test/Transforms/InstCombine/assume-loop-align.ll.  Hal, you added this test
recently.  This test breaking seems like a bug somewhere.  Any idea why
this test would break like this?

The missed optimization I was seeing (dead loop not deleted) was due to
loop simplify not running.  After my patch here's the chain of events
resulting in the loop being deleted:
1) InstCombine runs replacing latch branch predicate with undef ("bra
undef")
2) Loop simplify turns this into always exit at the latch: either "bra
true" or "bra false".  Without my patch loop simplify never runs.
3) SCEV can now reason about the loop and determine a trip count which is
trivially one.
4) loop deletion deletes loop.  (loop-deletion requires a trip count to
avoid removing infinite loops)

It looks like the only thing not being preserved here is LoopSimplify.  So
I'll add domtree and loopinfo preserved to the patch.

Mark


> I say, if fixing that is out-of-scope for this change, please file a PR
> for adding an API that updates DomTree and LoopInfo when a branch condition
> changes (to undef).
>
> -Andy
>
> >
> > -Hal
> >
> > ----- Original Message -----
> >> From: "Mark Heffernan" <meheff at google.com>
> >> To: "Andrew Trick" <atrick at apple.com>
> >> Cc: "llvm commits" <llvm-commits at cs.uiuc.edu>
> >> Sent: Monday, November 3, 2014 4:36:01 PM
> >> Subject: Re: [PATCH][instcombine] Remove preservesCFG from instcombine
> >>
> >>
> >>
> >>
> >> Here's the -debug-pass=Structure output from an O2 build of the repro
> >> code I posted earlier. The '+' lines are new with the change:
> >>
> >>
> >>
> >> Pass Arguments: -datalayout -notti -basictti -x86tti -no-aa -tbaa
> >> -scoped-noalias -assumption-tracker -targetlibinfo -basicaa -verify
> >> -simplifycfg -domtree -sroa -early-cse -lower-expect
> >> Data Layout
> >> No target information
> >> Target independent code generator's TTI
> >> X86 Target Transform Info
> >> No Alias Analysis (always returns 'may' alias)
> >> Type-Based Alias Analysis
> >> Scoped NoAlias Alias Analysis
> >> Assumption Tracker
> >> Target Library Information
> >> Basic Alias Analysis (stateless AA impl)
> >> FunctionPass Manager
> >> Module Verifier
> >> Simplify the CFG
> >> Dominator Tree Construction
> >> SROA
> >> Early CSE
> >> Lower 'expect' Intrinsics
> >> -Pass Arguments: -targetlibinfo -datalayout -notti -basictti -x86tti
> >> -no-aa -tbaa -scoped-noalias -assumption-tracker -basicaa -verify-di
> >> -ipsccp -globalopt -deadargelim -instcombine -simplifycfg -basiccg
> >> -prune-eh -inline-cost -inline -functionattrs -sroa -domtree
> >> -early-cse -lazy-value-info -jump-threading -correlated-propagation
> >> -simplifycfg -instcombine -tailcallelim -simplifycfg -reassociate
> >> -domtree -loops -loop-simplify -lcssa -loop-rotate -licm
> >> -loop-unswitch -instcombine -scalar-evolution -lcssa -indvars
> >> -loop-idiom -loop-deletion -function_tti -loop-unroll -memdep
> >> -mldst-motion -domtree -memdep -gvn -memdep -memcpyopt -sccp
> >> -instcombine -lazy-value-info -jump-threading
> >> -correlated-propagation -domtree -memdep -dse -adce -simplifycfg
> >> -instcombine -barrier -domtree -loops -loop-simplify -lcssa
> >> -branch-prob -block-freq -scalar-evolution -loop-vectorize
> >> -instcombine -scalar-evolution -slp-vectorizer -simplifycfg
> >> -instcombine -domtree -loops -loop-simplify -lcssa -scalar-evolution
> >> -function_tti -loop-unroll -alignment-from-assumptions
> >> -strip-dead-prototypes -globaldce -constmerge -verify -verify-di
> >> -print-module
> >> +Pass Arguments: -targetlibinfo -datalayout -notti -basictti -x86tti
> >> -no-aa -tbaa -scoped-noalias -assumption-tracker -basicaa -verify-di
> >> -ipsccp -globalopt -deadargelim -instcombine -simplifycfg -basiccg
> >> -prune-eh -inline-cost -inline -functionattrs -sroa -domtree
> >> -early-cse -lazy-value-info -jump-threading -correlated-propagation
> >> -simplifycfg -instcombine -tailcallelim -simplifycfg -reassociate
> >> -domtree -loops -loop-simplify -lcssa -loop-rotate -licm
> >> -loop-unswitch -instcombine -domtree -loops -scalar-evolution
> >> -loop-simplify -lcssa -indvars -loop-idiom -loop-deletion
> >> -function_tti -loop-unroll -memdep -mldst-motion -domtree -memdep
> >> -gvn -memdep -memcpyopt -sccp -instcombine -lazy-value-info
> >> -jump-threading -correlated-propagation -domtree -memdep -dse -adce
> >> -simplifycfg -instcombine -barrier -domtree -loops -loop-simplify
> >> -lcssa -branch-prob -block-freq -scalar-evolution -loop-vectorize
> >> -instcombine -domtree -loops -scalar-evolution -slp-vectorizer
> >> -simplifycfg -instcombine -domtree -loops -loop-simplify -lcssa
> >> -scalar-evolution -function_tti -loop-unroll
> >> -alignment-from-assumptions -strip-dead-prototypes -globaldce
> >> -constmerge -verify -verify-di -print-module
> >> Target Library Information
> >> Data Layout
> >> No target information
> >> Target independent code generator's TTI
> >> X86 Target Transform Info
> >> No Alias Analysis (always returns 'may' alias)
> >> Type-Based Alias Analysis
> >> Scoped NoAlias Alias Analysis
> >> Assumption Tracker
> >> Basic Alias Analysis (stateless AA impl)
> >> ModulePass Manager
> >> Debug Info Verifier
> >> Interprocedural Sparse Conditional Constant Propagation
> >> Global Variable Optimizer
> >> Dead Argument Elimination
> >> FunctionPass Manager
> >> Combine redundant instructions
> >> Simplify the CFG
> >> CallGraph Construction
> >> Call Graph SCC Pass Manager
> >> Remove unused exception handling info
> >> Inline Cost Analysis
> >> Function Integration/Inlining
> >> Deduce function attributes
> >> FunctionPass Manager
> >> SROA
> >> Dominator Tree Construction
> >> Early CSE
> >> Lazy Value Information Analysis
> >> Jump Threading
> >> Value Propagation
> >> Simplify the CFG
> >> Combine redundant instructions
> >> Tail Call Elimination
> >> Simplify the CFG
> >> Reassociate expressions
> >> Dominator Tree Construction
> >> Natural Loop Information
> >> Canonicalize natural loops
> >> Loop-Closed SSA Form Pass
> >> Loop Pass Manager
> >> Rotate Loops
> >> Loop Invariant Code Motion
> >> Unswitch loops
> >> Combine redundant instructions
> >> + Dominator Tree Construction
> >> + Natural Loop Information
> >> Scalar Evolution Analysis
> >> + Canonicalize natural loops
> >> Loop-Closed SSA Form Pass
> >> Loop Pass Manager
> >> Induction Variable Simplification
> >> Recognize loop idioms
> >> Delete dead loops
> >> Function TargetTransformInfo
> >> Loop Pass Manager
> >> Unroll loops
> >> Memory Dependence Analysis
> >> MergedLoadStoreMotion
> >> Dominator Tree Construction
> >> Memory Dependence Analysis
> >> Global Value Numbering
> >> Memory Dependence Analysis
> >> MemCpy Optimization
> >> Sparse Conditional Constant Propagation
> >> Combine redundant instructions
> >> Lazy Value Information Analysis
> >> Jump Threading
> >> Value Propagation
> >> Dominator Tree Construction
> >> Memory Dependence Analysis
> >> Dead Store Elimination
> >> Aggressive Dead Code Elimination
> >> Simplify the CFG
> >> Combine redundant instructions
> >> A No-Op Barrier Pass
> >> FunctionPass Manager
> >> Dominator Tree Construction
> >> Natural Loop Information
> >> Canonicalize natural loops
> >> Loop-Closed SSA Form Pass
> >> Branch Probability Analysis
> >> Block Frequency Analysis
> >> Scalar Evolution Analysis
> >> Loop Vectorization
> >> Combine redundant instructions
> >> + Dominator Tree Construction
> >> + Natural Loop Information
> >> Scalar Evolution Analysis
> >> SLP Vectorizer
> >> Simplify the CFG
> >> Combine redundant instructions
> >> Dominator Tree Construction
> >> Natural Loop Information
> >> Canonicalize natural loops
> >> Loop-Closed SSA Form Pass
> >> Scalar Evolution Analysis
> >> Function TargetTransformInfo
> >> Loop Pass Manager
> >> Unroll loops
> >> Alignment from assumptions
> >> Strip Unused Function Prototypes
> >> Dead Global Elimination
> >> Merge Duplicate Global Constants
> >> FunctionPass Manager
> >> Module Verifier
> >> Debug Info Verifier
> >> Print module to stderr
> >>
> >>
> >>
> >>
> >> On Mon, Nov 3, 2014 at 1:14 PM, Andrew Trick < atrick at apple.com >
> >> wrote:
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >> On Nov 3, 2014, at 11:48 AM, Mark Heffernan < meheff at google.com >
> >> wrote:
> >>
> >>
> >> Hi Andrew,
> >>
> >>
> >> (Let me know if someone else is more appropriate to review this).
> >> This patch removes setPreservesCFG() from the instcombine pass. The
> >> reason is that InstCombine can modify terminator instructions
> >> (branches) specifically the predicate. Instcombine clears out dead
> >> blocks and replaces all the dead instruction uses with undef:
> >>
> >>
> >>
> http://llvm.org/docs/doxygen/html/InstructionCombining_8cpp_source.html#l02792
> >>
> >>
> >>
> >> These new undefs can be predicates of branches, and from the comment
> >> of setPreservesCFG():
> >>
> >>
> >>
> >> // setPreservesCFG - This function should be called to by the pass,
> >> iff they do
> >> // not:
> >> //
> >> // 1. Add or remove basic blocks from the function
> >> // 2. Modify terminator instructions in any way.
> >>
> >>
> >> Clearly instcombine is modifying terminator instructions so should
> >> not be calling setPreservesCFG(). The specific issue I ran into was
> >> a dead loop not being deleted because of stale analysis data. Repro:
> >>
> >>
> >>
> >> void foo(int *a, int start_x, int end_x, int start_y, int end_y, int
> >> k) {
> >> for (int y = start_y; y < end_y; y++) {
> >> for (int x = start_x; x < end_x; x++) {
> >> for (int i = 0; i < 1000; i++) {
> >> a[x + y + k*i] = 0;
> >> }
> >> }
> >> }
> >> }
> >>
> >>
> >> # Dead loop left after loop deletion:
> >> clang -S -emit-llvm foo.c -o - | opt -O1 -loop-rotate -loop-unswitch
> >> -instcombine -loop-deletion -S > bad.ll
> >>
> >>
> >>
> >> # Dead loop removed by splitting out loop-deletion into separate opt
> >> invocation:
> >> clang -S -emit-llvm foo.c -o - | opt -O1 -loop-rotate -loop-unswitch
> >> -instcombine | opt -loop-deletion -S > good.ll
> >>
> >>
> >>
> >> This patch fixes this discrepancy.
> >>
> >> This seems totally reasonable given the current (lack of)
> >> invalidation mechanisms.
> >>
> >>
> >> For the record, can you report the difference between
> >> -debug-pass=Structure before and after output?
> >>
> >>
> >> -Andy
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >> Mark <no_cfg_preserve_instcombine.patch>
> >>
> >>
> >> _______________________________________________
> >> llvm-commits mailing list
> >> llvm-commits at cs.uiuc.edu
> >> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> >>
> >
> > --
> > Hal Finkel
> > Assistant Computational Scientist
> > Leadership Computing Facility
> > Argonne National Laboratory
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20141103/5920d7c0/attachment.html>


More information about the llvm-commits mailing list