[PATCH][instcombine] Remove preservesCFG from instcombine

Andrew Trick atrick at apple.com
Mon Nov 3 15:44:31 PST 2014


> On Nov 3, 2014, at 2:49 PM, Hal Finkel <hfinkel at anl.gov> wrote:
> 
> Can we still have InstCombine preserve the DomTree?

InstCombine *really* should preserve DomTree and LoopInfo. But one of those may be causing the missed optimization. Maybe DominatorTree::isReachableFromEntry needs to be updated? That could in turn affect LoopInfo or LCSSA.

I say, if fixing that is out-of-scope for this change, please file a PR for adding an API that updates DomTree and LoopInfo when a branch condition changes (to undef).

-Andy

> 
> -Hal
> 
> ----- Original Message -----
>> From: "Mark Heffernan" <meheff at google.com>
>> To: "Andrew Trick" <atrick at apple.com>
>> Cc: "llvm commits" <llvm-commits at cs.uiuc.edu>
>> Sent: Monday, November 3, 2014 4:36:01 PM
>> Subject: Re: [PATCH][instcombine] Remove preservesCFG from instcombine
>> 
>> 
>> 
>> 
>> Here's the -debug-pass=Structure output from an O2 build of the repro
>> code I posted earlier. The '+' lines are new with the change:
>> 
>> 
>> 
>> Pass Arguments: -datalayout -notti -basictti -x86tti -no-aa -tbaa
>> -scoped-noalias -assumption-tracker -targetlibinfo -basicaa -verify
>> -simplifycfg -domtree -sroa -early-cse -lower-expect
>> Data Layout
>> No target information
>> Target independent code generator's TTI
>> X86 Target Transform Info
>> No Alias Analysis (always returns 'may' alias)
>> Type-Based Alias Analysis
>> Scoped NoAlias Alias Analysis
>> Assumption Tracker
>> Target Library Information
>> Basic Alias Analysis (stateless AA impl)
>> FunctionPass Manager
>> Module Verifier
>> Simplify the CFG
>> Dominator Tree Construction
>> SROA
>> Early CSE
>> Lower 'expect' Intrinsics
>> -Pass Arguments: -targetlibinfo -datalayout -notti -basictti -x86tti
>> -no-aa -tbaa -scoped-noalias -assumption-tracker -basicaa -verify-di
>> -ipsccp -globalopt -deadargelim -instcombine -simplifycfg -basiccg
>> -prune-eh -inline-cost -inline -functionattrs -sroa -domtree
>> -early-cse -lazy-value-info -jump-threading -correlated-propagation
>> -simplifycfg -instcombine -tailcallelim -simplifycfg -reassociate
>> -domtree -loops -loop-simplify -lcssa -loop-rotate -licm
>> -loop-unswitch -instcombine -scalar-evolution -lcssa -indvars
>> -loop-idiom -loop-deletion -function_tti -loop-unroll -memdep
>> -mldst-motion -domtree -memdep -gvn -memdep -memcpyopt -sccp
>> -instcombine -lazy-value-info -jump-threading
>> -correlated-propagation -domtree -memdep -dse -adce -simplifycfg
>> -instcombine -barrier -domtree -loops -loop-simplify -lcssa
>> -branch-prob -block-freq -scalar-evolution -loop-vectorize
>> -instcombine -scalar-evolution -slp-vectorizer -simplifycfg
>> -instcombine -domtree -loops -loop-simplify -lcssa -scalar-evolution
>> -function_tti -loop-unroll -alignment-from-assumptions
>> -strip-dead-prototypes -globaldce -constmerge -verify -verify-di
>> -print-module
>> +Pass Arguments: -targetlibinfo -datalayout -notti -basictti -x86tti
>> -no-aa -tbaa -scoped-noalias -assumption-tracker -basicaa -verify-di
>> -ipsccp -globalopt -deadargelim -instcombine -simplifycfg -basiccg
>> -prune-eh -inline-cost -inline -functionattrs -sroa -domtree
>> -early-cse -lazy-value-info -jump-threading -correlated-propagation
>> -simplifycfg -instcombine -tailcallelim -simplifycfg -reassociate
>> -domtree -loops -loop-simplify -lcssa -loop-rotate -licm
>> -loop-unswitch -instcombine -domtree -loops -scalar-evolution
>> -loop-simplify -lcssa -indvars -loop-idiom -loop-deletion
>> -function_tti -loop-unroll -memdep -mldst-motion -domtree -memdep
>> -gvn -memdep -memcpyopt -sccp -instcombine -lazy-value-info
>> -jump-threading -correlated-propagation -domtree -memdep -dse -adce
>> -simplifycfg -instcombine -barrier -domtree -loops -loop-simplify
>> -lcssa -branch-prob -block-freq -scalar-evolution -loop-vectorize
>> -instcombine -domtree -loops -scalar-evolution -slp-vectorizer
>> -simplifycfg -instcombine -domtree -loops -loop-simplify -lcssa
>> -scalar-evolution -function_tti -loop-unroll
>> -alignment-from-assumptions -strip-dead-prototypes -globaldce
>> -constmerge -verify -verify-di -print-module
>> Target Library Information
>> Data Layout
>> No target information
>> Target independent code generator's TTI
>> X86 Target Transform Info
>> No Alias Analysis (always returns 'may' alias)
>> Type-Based Alias Analysis
>> Scoped NoAlias Alias Analysis
>> Assumption Tracker
>> Basic Alias Analysis (stateless AA impl)
>> ModulePass Manager
>> Debug Info Verifier
>> Interprocedural Sparse Conditional Constant Propagation
>> Global Variable Optimizer
>> Dead Argument Elimination
>> FunctionPass Manager
>> Combine redundant instructions
>> Simplify the CFG
>> CallGraph Construction
>> Call Graph SCC Pass Manager
>> Remove unused exception handling info
>> Inline Cost Analysis
>> Function Integration/Inlining
>> Deduce function attributes
>> FunctionPass Manager
>> SROA
>> Dominator Tree Construction
>> Early CSE
>> Lazy Value Information Analysis
>> Jump Threading
>> Value Propagation
>> Simplify the CFG
>> Combine redundant instructions
>> Tail Call Elimination
>> Simplify the CFG
>> Reassociate expressions
>> Dominator Tree Construction
>> Natural Loop Information
>> Canonicalize natural loops
>> Loop-Closed SSA Form Pass
>> Loop Pass Manager
>> Rotate Loops
>> Loop Invariant Code Motion
>> Unswitch loops
>> Combine redundant instructions
>> + Dominator Tree Construction
>> + Natural Loop Information
>> Scalar Evolution Analysis
>> + Canonicalize natural loops
>> Loop-Closed SSA Form Pass
>> Loop Pass Manager
>> Induction Variable Simplification
>> Recognize loop idioms
>> Delete dead loops
>> Function TargetTransformInfo
>> Loop Pass Manager
>> Unroll loops
>> Memory Dependence Analysis
>> MergedLoadStoreMotion
>> Dominator Tree Construction
>> Memory Dependence Analysis
>> Global Value Numbering
>> Memory Dependence Analysis
>> MemCpy Optimization
>> Sparse Conditional Constant Propagation
>> Combine redundant instructions
>> Lazy Value Information Analysis
>> Jump Threading
>> Value Propagation
>> Dominator Tree Construction
>> Memory Dependence Analysis
>> Dead Store Elimination
>> Aggressive Dead Code Elimination
>> Simplify the CFG
>> Combine redundant instructions
>> A No-Op Barrier Pass
>> FunctionPass Manager
>> Dominator Tree Construction
>> Natural Loop Information
>> Canonicalize natural loops
>> Loop-Closed SSA Form Pass
>> Branch Probability Analysis
>> Block Frequency Analysis
>> Scalar Evolution Analysis
>> Loop Vectorization
>> Combine redundant instructions
>> + Dominator Tree Construction
>> + Natural Loop Information
>> Scalar Evolution Analysis
>> SLP Vectorizer
>> Simplify the CFG
>> Combine redundant instructions
>> Dominator Tree Construction
>> Natural Loop Information
>> Canonicalize natural loops
>> Loop-Closed SSA Form Pass
>> Scalar Evolution Analysis
>> Function TargetTransformInfo
>> Loop Pass Manager
>> Unroll loops
>> Alignment from assumptions
>> Strip Unused Function Prototypes
>> Dead Global Elimination
>> Merge Duplicate Global Constants
>> FunctionPass Manager
>> Module Verifier
>> Debug Info Verifier
>> Print module to stderr
>> 
>> 
>> 
>> 
>> On Mon, Nov 3, 2014 at 1:14 PM, Andrew Trick < atrick at apple.com >
>> wrote:
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> On Nov 3, 2014, at 11:48 AM, Mark Heffernan < meheff at google.com >
>> wrote:
>> 
>> 
>> Hi Andrew,
>> 
>> 
>> (Let me know if someone else is more appropriate to review this).
>> This patch removes setPreservesCFG() from the instcombine pass. The
>> reason is that InstCombine can modify terminator instructions
>> (branches) specifically the predicate. Instcombine clears out dead
>> blocks and replaces all the dead instruction uses with undef:
>> 
>> 
>> http://llvm.org/docs/doxygen/html/InstructionCombining_8cpp_source.html#l02792
>> 
>> 
>> 
>> These new undefs can be predicates of branches, and from the comment
>> of setPreservesCFG():
>> 
>> 
>> 
>> // setPreservesCFG - This function should be called to by the pass,
>> iff they do
>> // not:
>> //
>> // 1. Add or remove basic blocks from the function
>> // 2. Modify terminator instructions in any way.
>> 
>> 
>> Clearly instcombine is modifying terminator instructions so should
>> not be calling setPreservesCFG(). The specific issue I ran into was
>> a dead loop not being deleted because of stale analysis data. Repro:
>> 
>> 
>> 
>> void foo(int *a, int start_x, int end_x, int start_y, int end_y, int
>> k) {
>> for (int y = start_y; y < end_y; y++) {
>> for (int x = start_x; x < end_x; x++) {
>> for (int i = 0; i < 1000; i++) {
>> a[x + y + k*i] = 0;
>> }
>> }
>> }
>> }
>> 
>> 
>> # Dead loop left after loop deletion:
>> clang -S -emit-llvm foo.c -o - | opt -O1 -loop-rotate -loop-unswitch
>> -instcombine -loop-deletion -S > bad.ll
>> 
>> 
>> 
>> # Dead loop removed by splitting out loop-deletion into separate opt
>> invocation:
>> clang -S -emit-llvm foo.c -o - | opt -O1 -loop-rotate -loop-unswitch
>> -instcombine | opt -loop-deletion -S > good.ll
>> 
>> 
>> 
>> This patch fixes this discrepancy.
>> 
>> This seems totally reasonable given the current (lack of)
>> invalidation mechanisms.
>> 
>> 
>> For the record, can you report the difference between
>> -debug-pass=Structure before and after output?
>> 
>> 
>> -Andy
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> Mark <no_cfg_preserve_instcombine.patch>
>> 
>> 
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>> 
> 
> -- 
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory





More information about the llvm-commits mailing list