[llvm] r294398 - CVP: Make CVP iterate in an order that maximizes reuse of LVI cache

Tue Feb 7 18:35:07 PST 2017

Author: dannyb
Date: Tue Feb  7 20:35:07 2017
New Revision: 294398

URL: http://llvm.org/viewvc/llvm-project?rev=294398&view=rev
Log:
CVP: Make CVP iterate in an order that maximizes reuse of LVI cache

Summary:
After the DFS order change for LVI, i have a few testcases that now
take forever.

The TL;DR - This is mainly due to the overdefined cache, but that
requires predicateinfo to fix[1]

In order to maximize reuse of the LVI cache for now, change the order
we iterate in.

This reduces my testcase from 5 minutes to 4 seconds.

I have verified cases like gmic do not get slower.

I am playing with whether the order should be postorder or idf.

[1] In practice, overdefined anywhere should be overdefined
everywhere, so this cache should be global.  That also fixes this bug.
The problem, however, is that LVI relies on this cache being filled in
per-block because it wants different values in different blocks due to
precisely the naming issue that predicateinfo fixes.  With
predicateinfo, making the cache global works fine on individual
passes, and also resolves this issue.

Reviewers: davide, sanjoy, chandlerc

Subscribers: llvm-commits, djasper

Differential Revision: https://reviews.llvm.org/D29679

Modified:
    llvm/trunk/lib/Transforms/Scalar/CorrelatedValuePropagation.cpp

Modified: llvm/trunk/lib/Transforms/Scalar/CorrelatedValuePropagation.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/CorrelatedValuePropagation.cpp?rev=294398&r1=294397&r2=294398&view=diff
==============================================================================

--- llvm/trunk/lib/Transforms/Scalar/CorrelatedValuePropagation.cpp (original)
+++ llvm/trunk/lib/Transforms/Scalar/CorrelatedValuePropagation.cpp Tue Feb  7 20:35:07 2017
@@ -481,12 +481,11 @@ static Constant *getConstantAt(Value *V,
 static bool runImpl(Function &F, LazyValueInfo *LVI) {
   bool FnChanged = false;
 
-  // Visiting in a pre-order depth-first traversal causes us to simplify early
-  // blocks before querying later blocks (which require us to analyze early
-  // blocks).  Eagerly simplifying shallow blocks means there is strictly less
-  // work to do for deep blocks.  This also means we don't visit unreachable
-  // blocks. 
-  for (BasicBlock *BB : depth_first(&F.getEntryBlock())) {
+  // Visiting in an inverse depth first traversal maximizes reuse of the LVI
+  // cache, as it currently iterates in depth first order upwards, caching
+  // overdefined info as it goes.
+
+  for (BasicBlock *BB : inverse_depth_first(&F.getEntryBlock())) {
     bool BBChanged = false;
     for (BasicBlock::iterator BI = BB->begin(), BE = BB->end(); BI != BE;) {
       Instruction *II = &*BI++;