[PATCH][instcombine] Remove preservesCFG from instcombine

Andrew Trick atrick at apple.com
Mon Nov 3 17:25:07 PST 2014


> On Nov 3, 2014, at 4:24 PM, Mark Heffernan <meheff at google.com> wrote:
> 
> On Mon, Nov 3, 2014 at 3:44 PM, Andrew Trick <atrick at apple.com <mailto:atrick at apple.com>> wrote:
> 
> > On Nov 3, 2014, at 2:49 PM, Hal Finkel <hfinkel at anl.gov <mailto:hfinkel at anl.gov>> wrote:
> >
> > Can we still have InstCombine preserve the DomTree?
> 
> InstCombine *really* should preserve DomTree and LoopInfo. But one of those may be causing the missed optimization. Maybe DominatorTree::isReachableFromEntry needs to be updated? That could in turn affect LoopInfo or LCSSA.
> 
> I think DomTree is preserved.  In fact, there is a test which fails if I *don't* preserve DomTree: test/Transforms/InstCombine/assume-loop-align.ll.  Hal, you added this test recently.  This test breaking seems like a bug somewhere.  Any idea why this test would break like this?
> 
> The missed optimization I was seeing (dead loop not deleted) was due to loop simplify not running.  After my patch here's the chain of events resulting in the loop being deleted:
> 1) InstCombine runs replacing latch branch predicate with undef ("bra undef")
> 2) Loop simplify turns this into always exit at the latch: either "bra true" or "bra false".  Without my patch loop simplify never runs.
> 3) SCEV can now reason about the loop and determine a trip count which is trivially one.
> 4) loop deletion deletes loop.  (loop-deletion requires a trip count to avoid removing infinite loops)
> 
> It looks like the only thing not being preserved here is LoopSimplify.  So I'll add domtree and loopinfo preserved to the patch.

Sounds great. That makes perfect sense. Thanks for the explanation.

-Andy

> 
> Mark
> 
> 
> I say, if fixing that is out-of-scope for this change, please file a PR for adding an API that updates DomTree and LoopInfo when a branch condition changes (to undef).
> 
> -Andy
> 
> >
> > -Hal
> >
> > ----- Original Message -----
> >> From: "Mark Heffernan" <meheff at google.com <mailto:meheff at google.com>>
> >> To: "Andrew Trick" <atrick at apple.com <mailto:atrick at apple.com>>
> >> Cc: "llvm commits" <llvm-commits at cs.uiuc.edu <mailto:llvm-commits at cs.uiuc.edu>>
> >> Sent: Monday, November 3, 2014 4:36:01 PM
> >> Subject: Re: [PATCH][instcombine] Remove preservesCFG from instcombine
> >>
> >>
> >>
> >>
> >> Here's the -debug-pass=Structure output from an O2 build of the repro
> >> code I posted earlier. The '+' lines are new with the change:
> >>
> >>
> >>
> >> Pass Arguments: -datalayout -notti -basictti -x86tti -no-aa -tbaa
> >> -scoped-noalias -assumption-tracker -targetlibinfo -basicaa -verify
> >> -simplifycfg -domtree -sroa -early-cse -lower-expect
> >> Data Layout
> >> No target information
> >> Target independent code generator's TTI
> >> X86 Target Transform Info
> >> No Alias Analysis (always returns 'may' alias)
> >> Type-Based Alias Analysis
> >> Scoped NoAlias Alias Analysis
> >> Assumption Tracker
> >> Target Library Information
> >> Basic Alias Analysis (stateless AA impl)
> >> FunctionPass Manager
> >> Module Verifier
> >> Simplify the CFG
> >> Dominator Tree Construction
> >> SROA
> >> Early CSE
> >> Lower 'expect' Intrinsics
> >> -Pass Arguments: -targetlibinfo -datalayout -notti -basictti -x86tti
> >> -no-aa -tbaa -scoped-noalias -assumption-tracker -basicaa -verify-di
> >> -ipsccp -globalopt -deadargelim -instcombine -simplifycfg -basiccg
> >> -prune-eh -inline-cost -inline -functionattrs -sroa -domtree
> >> -early-cse -lazy-value-info -jump-threading -correlated-propagation
> >> -simplifycfg -instcombine -tailcallelim -simplifycfg -reassociate
> >> -domtree -loops -loop-simplify -lcssa -loop-rotate -licm
> >> -loop-unswitch -instcombine -scalar-evolution -lcssa -indvars
> >> -loop-idiom -loop-deletion -function_tti -loop-unroll -memdep
> >> -mldst-motion -domtree -memdep -gvn -memdep -memcpyopt -sccp
> >> -instcombine -lazy-value-info -jump-threading
> >> -correlated-propagation -domtree -memdep -dse -adce -simplifycfg
> >> -instcombine -barrier -domtree -loops -loop-simplify -lcssa
> >> -branch-prob -block-freq -scalar-evolution -loop-vectorize
> >> -instcombine -scalar-evolution -slp-vectorizer -simplifycfg
> >> -instcombine -domtree -loops -loop-simplify -lcssa -scalar-evolution
> >> -function_tti -loop-unroll -alignment-from-assumptions
> >> -strip-dead-prototypes -globaldce -constmerge -verify -verify-di
> >> -print-module
> >> +Pass Arguments: -targetlibinfo -datalayout -notti -basictti -x86tti
> >> -no-aa -tbaa -scoped-noalias -assumption-tracker -basicaa -verify-di
> >> -ipsccp -globalopt -deadargelim -instcombine -simplifycfg -basiccg
> >> -prune-eh -inline-cost -inline -functionattrs -sroa -domtree
> >> -early-cse -lazy-value-info -jump-threading -correlated-propagation
> >> -simplifycfg -instcombine -tailcallelim -simplifycfg -reassociate
> >> -domtree -loops -loop-simplify -lcssa -loop-rotate -licm
> >> -loop-unswitch -instcombine -domtree -loops -scalar-evolution
> >> -loop-simplify -lcssa -indvars -loop-idiom -loop-deletion
> >> -function_tti -loop-unroll -memdep -mldst-motion -domtree -memdep
> >> -gvn -memdep -memcpyopt -sccp -instcombine -lazy-value-info
> >> -jump-threading -correlated-propagation -domtree -memdep -dse -adce
> >> -simplifycfg -instcombine -barrier -domtree -loops -loop-simplify
> >> -lcssa -branch-prob -block-freq -scalar-evolution -loop-vectorize
> >> -instcombine -domtree -loops -scalar-evolution -slp-vectorizer
> >> -simplifycfg -instcombine -domtree -loops -loop-simplify -lcssa
> >> -scalar-evolution -function_tti -loop-unroll
> >> -alignment-from-assumptions -strip-dead-prototypes -globaldce
> >> -constmerge -verify -verify-di -print-module
> >> Target Library Information
> >> Data Layout
> >> No target information
> >> Target independent code generator's TTI
> >> X86 Target Transform Info
> >> No Alias Analysis (always returns 'may' alias)
> >> Type-Based Alias Analysis
> >> Scoped NoAlias Alias Analysis
> >> Assumption Tracker
> >> Basic Alias Analysis (stateless AA impl)
> >> ModulePass Manager
> >> Debug Info Verifier
> >> Interprocedural Sparse Conditional Constant Propagation
> >> Global Variable Optimizer
> >> Dead Argument Elimination
> >> FunctionPass Manager
> >> Combine redundant instructions
> >> Simplify the CFG
> >> CallGraph Construction
> >> Call Graph SCC Pass Manager
> >> Remove unused exception handling info
> >> Inline Cost Analysis
> >> Function Integration/Inlining
> >> Deduce function attributes
> >> FunctionPass Manager
> >> SROA
> >> Dominator Tree Construction
> >> Early CSE
> >> Lazy Value Information Analysis
> >> Jump Threading
> >> Value Propagation
> >> Simplify the CFG
> >> Combine redundant instructions
> >> Tail Call Elimination
> >> Simplify the CFG
> >> Reassociate expressions
> >> Dominator Tree Construction
> >> Natural Loop Information
> >> Canonicalize natural loops
> >> Loop-Closed SSA Form Pass
> >> Loop Pass Manager
> >> Rotate Loops
> >> Loop Invariant Code Motion
> >> Unswitch loops
> >> Combine redundant instructions
> >> + Dominator Tree Construction
> >> + Natural Loop Information
> >> Scalar Evolution Analysis
> >> + Canonicalize natural loops
> >> Loop-Closed SSA Form Pass
> >> Loop Pass Manager
> >> Induction Variable Simplification
> >> Recognize loop idioms
> >> Delete dead loops
> >> Function TargetTransformInfo
> >> Loop Pass Manager
> >> Unroll loops
> >> Memory Dependence Analysis
> >> MergedLoadStoreMotion
> >> Dominator Tree Construction
> >> Memory Dependence Analysis
> >> Global Value Numbering
> >> Memory Dependence Analysis
> >> MemCpy Optimization
> >> Sparse Conditional Constant Propagation
> >> Combine redundant instructions
> >> Lazy Value Information Analysis
> >> Jump Threading
> >> Value Propagation
> >> Dominator Tree Construction
> >> Memory Dependence Analysis
> >> Dead Store Elimination
> >> Aggressive Dead Code Elimination
> >> Simplify the CFG
> >> Combine redundant instructions
> >> A No-Op Barrier Pass
> >> FunctionPass Manager
> >> Dominator Tree Construction
> >> Natural Loop Information
> >> Canonicalize natural loops
> >> Loop-Closed SSA Form Pass
> >> Branch Probability Analysis
> >> Block Frequency Analysis
> >> Scalar Evolution Analysis
> >> Loop Vectorization
> >> Combine redundant instructions
> >> + Dominator Tree Construction
> >> + Natural Loop Information
> >> Scalar Evolution Analysis
> >> SLP Vectorizer
> >> Simplify the CFG
> >> Combine redundant instructions
> >> Dominator Tree Construction
> >> Natural Loop Information
> >> Canonicalize natural loops
> >> Loop-Closed SSA Form Pass
> >> Scalar Evolution Analysis
> >> Function TargetTransformInfo
> >> Loop Pass Manager
> >> Unroll loops
> >> Alignment from assumptions
> >> Strip Unused Function Prototypes
> >> Dead Global Elimination
> >> Merge Duplicate Global Constants
> >> FunctionPass Manager
> >> Module Verifier
> >> Debug Info Verifier
> >> Print module to stderr
> >>
> >>
> >>
> >>
> >> On Mon, Nov 3, 2014 at 1:14 PM, Andrew Trick < atrick at apple.com <mailto:atrick at apple.com> >
> >> wrote:
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >> On Nov 3, 2014, at 11:48 AM, Mark Heffernan < meheff at google.com <mailto:meheff at google.com> >
> >> wrote:
> >>
> >>
> >> Hi Andrew,
> >>
> >>
> >> (Let me know if someone else is more appropriate to review this).
> >> This patch removes setPreservesCFG() from the instcombine pass. The
> >> reason is that InstCombine can modify terminator instructions
> >> (branches) specifically the predicate. Instcombine clears out dead
> >> blocks and replaces all the dead instruction uses with undef:
> >>
> >>
> >> http://llvm.org/docs/doxygen/html/InstructionCombining_8cpp_source.html#l02792 <http://llvm.org/docs/doxygen/html/InstructionCombining_8cpp_source.html#l02792>
> >>
> >>
> >>
> >> These new undefs can be predicates of branches, and from the comment
> >> of setPreservesCFG():
> >>
> >>
> >>
> >> // setPreservesCFG - This function should be called to by the pass,
> >> iff they do
> >> // not:
> >> //
> >> // 1. Add or remove basic blocks from the function
> >> // 2. Modify terminator instructions in any way.
> >>
> >>
> >> Clearly instcombine is modifying terminator instructions so should
> >> not be calling setPreservesCFG(). The specific issue I ran into was
> >> a dead loop not being deleted because of stale analysis data. Repro:
> >>
> >>
> >>
> >> void foo(int *a, int start_x, int end_x, int start_y, int end_y, int
> >> k) {
> >> for (int y = start_y; y < end_y; y++) {
> >> for (int x = start_x; x < end_x; x++) {
> >> for (int i = 0; i < 1000; i++) {
> >> a[x + y + k*i] = 0;
> >> }
> >> }
> >> }
> >> }
> >>
> >>
> >> # Dead loop left after loop deletion:
> >> clang -S -emit-llvm foo.c -o - | opt -O1 -loop-rotate -loop-unswitch
> >> -instcombine -loop-deletion -S > bad.ll
> >>
> >>
> >>
> >> # Dead loop removed by splitting out loop-deletion into separate opt
> >> invocation:
> >> clang -S -emit-llvm foo.c -o - | opt -O1 -loop-rotate -loop-unswitch
> >> -instcombine | opt -loop-deletion -S > good.ll
> >>
> >>
> >>
> >> This patch fixes this discrepancy.
> >>
> >> This seems totally reasonable given the current (lack of)
> >> invalidation mechanisms.
> >>
> >>
> >> For the record, can you report the difference between
> >> -debug-pass=Structure before and after output?
> >>
> >>
> >> -Andy
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >> Mark <no_cfg_preserve_instcombine.patch>
> >>
> >>
> >> _______________________________________________
> >> llvm-commits mailing list
> >> llvm-commits at cs.uiuc.edu <mailto:llvm-commits at cs.uiuc.edu>
> >> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits <http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits>
> >>
> >
> > --
> > Hal Finkel
> > Assistant Computational Scientist
> > Leadership Computing Facility
> > Argonne National Laboratory

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20141103/9029c392/attachment.html>


More information about the llvm-commits mailing list