[llvm] r181532 - [GVN] Split critical-edge on the fly, instead of postpone edge-splitting to next

Shuxin Yang shuxin.llvm at gmail.com
Thu May 9 14:37:28 PDT 2013


GVN is very expensive, and 2/3 the time comes from dense 
memory-dependence query.
It currently iteratively goes through all blocks top-down until no 
change take place.

My ultimate goal is to speed it up converting iterative algorithm into a 
one-pass algorithm.
If we can speedup GVN by 2x, we can afford invoke it twice, once earlier 
(say, somewhere
around EarlyCSE), another one stay where it is today.

There are couple of reasons why *this* GVN needs to be iterative, I will 
explain the tricky reasons
in my subsequent commits.

For this specific change, it is pretty straightforward:

    before my change, the logic was:
    ---------------------------------
       ... performing poor-man's PRE for the load...
       check if there are critical-edge between the source-block and 
target block (which
        must be the immediate predecessor of the source block).

       if critical-edge present {
           collect the critical-edge.
           give up.
           // next itearation GVN will split the critical edge, and then 
go ahead perform the PRE.
       }
      ----------------------------

     and the logic is:
     -------------
          ... performing poor-man's PRE for load...
          check critical-edge presence.
          if (yes) {
               split the critical-edge !!!!! split critical edge on the 
fly !!!!!!
          }
          go ahead performing PRE
     -------------------------



On 5/9/13 2:16 PM, Eric Christopher wrote:
> Sounds interesting. What's your plan here? Have an outline of the new algorithm?
>
> -eric
>
> On Thu, May 9, 2013 at 11:34 AM, Shuxin Yang <shuxin.llvm at gmail.com> wrote:
>> Author: shuxin_yang
>> Date: Thu May  9 13:34:27 2013
>> New Revision: 181532
>>
>> URL: http://llvm.org/viewvc/llvm-project?rev=181532&view=rev
>> Log:
>> [GVN] Split critical-edge on the fly, instead of postpone edge-splitting to next
>>    iteration.
>>
>>    This on step toward non-iterative GVN. My local hack suggests that getting rid
>> of iteration will speedup GVN by 30%+ on a medium sized input (2k LOC, C++).
>> I cannot explain why not 2x or more at this moment.
>>
>> Modified:
>>      llvm/trunk/lib/Transforms/Scalar/GVN.cpp
>>
>> Modified: llvm/trunk/lib/Transforms/Scalar/GVN.cpp
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/GVN.cpp?rev=181532&r1=181531&r2=181532&view=diff
>> ==============================================================================
>> --- llvm/trunk/lib/Transforms/Scalar/GVN.cpp (original)
>> +++ llvm/trunk/lib/Transforms/Scalar/GVN.cpp Thu May  9 13:34:27 2013
>> @@ -45,6 +45,7 @@
>>   #include "llvm/Target/TargetLibraryInfo.h"
>>   #include "llvm/Transforms/Utils/BasicBlockUtils.h"
>>   #include "llvm/Transforms/Utils/SSAUpdater.h"
>> +#include <vector>
>>   using namespace llvm;
>>   using namespace PatternMatch;
>>
>> @@ -692,6 +693,7 @@ namespace {
>>       void cleanupGlobalSets();
>>       void verifyRemoved(const Instruction *I) const;
>>       bool splitCriticalEdges();
>> +    BasicBlock *splitCriticalEdges(BasicBlock *Pred, BasicBlock *Succ);
>>       unsigned replaceAllDominatedUsesWith(Value *From, Value *To,
>>                                            const BasicBlockEdge &Root);
>>       bool propagateEquality(Value *LHS, Value *RHS, const BasicBlockEdge &Root);
>> @@ -1513,7 +1515,7 @@ bool GVN::PerformLoadPRE(LoadInst *LI, A
>>     for (unsigned i = 0, e = UnavailableBlocks.size(); i != e; ++i)
>>       FullyAvailableBlocks[UnavailableBlocks[i]] = false;
>>
>> -  SmallVector<std::pair<TerminatorInst*, unsigned>, 4> NeedToSplit;
>> +  SmallVector<BasicBlock *, 4> CriticalEdgePred;
>>     for (pred_iterator PI = pred_begin(LoadBB), E = pred_end(LoadBB);
>>          PI != E; ++PI) {
>>       BasicBlock *Pred = *PI;
>> @@ -1536,20 +1538,14 @@ bool GVN::PerformLoadPRE(LoadInst *LI, A
>>           return false;
>>         }
>>
>> -      unsigned SuccNum = GetSuccessorNumber(Pred, LoadBB);
>> -      NeedToSplit.push_back(std::make_pair(Pred->getTerminator(), SuccNum));
>> +      CriticalEdgePred.push_back(Pred);
>>       }
>>     }
>>
>> -  if (!NeedToSplit.empty()) {
>> -    toSplit.append(NeedToSplit.begin(), NeedToSplit.end());
>> -    return false;
>> -  }
>> -
>>     // Decide whether PRE is profitable for this load.
>>     unsigned NumUnavailablePreds = PredLoads.size();
>>     assert(NumUnavailablePreds != 0 &&
>> -         "Fully available value should be eliminated above!");
>> +         "Fully available value should already be eliminated!");
>>
>>     // If this load is unavailable in multiple predecessors, reject it.
>>     // FIXME: If we could restructure the CFG, we could make a common pred with
>> @@ -1558,6 +1554,17 @@ bool GVN::PerformLoadPRE(LoadInst *LI, A
>>     if (NumUnavailablePreds != 1)
>>         return false;
>>
>> +  // Split critical edges, and update the unavailable predecessors accordingly.
>> +  for (SmallVector<BasicBlock *, 4>::iterator I = CriticalEdgePred.begin(),
>> +         E = CriticalEdgePred.end(); I != E; I++) {
>> +    BasicBlock *OrigPred = *I;
>> +    BasicBlock *NewPred = splitCriticalEdges(OrigPred, LoadBB);
>> +    PredLoads.erase(OrigPred);
>> +    PredLoads[NewPred] = 0;
>> +    DEBUG(dbgs() << "Split critical edge " << OrigPred->getName() << "->"
>> +                 << LoadBB->getName() << '\n');
>> +  }
>> +
>>     // Check if the load can safely be moved to all the unavailable predecessors.
>>     bool CanDoPRE = true;
>>     SmallVector<Instruction*, 8> NewInsts;
>> @@ -1594,7 +1601,9 @@ bool GVN::PerformLoadPRE(LoadInst *LI, A
>>         if (MD) MD->removeInstruction(I);
>>         I->eraseFromParent();
>>       }
>> -    return false;
>> +    // HINT:Don't revert the edge-splitting as following transformation may
>> +    // also need to split these critial edges.
>> +    return !CriticalEdgePred.empty();
>>     }
>>
>>     // Okay, we can eliminate this load by inserting a reload in the predecessor
>> @@ -2297,8 +2306,6 @@ bool GVN::runOnFunction(Function& F) {
>>     while (ShouldContinue) {
>>       DEBUG(dbgs() << "GVN iteration: " << Iteration << "\n");
>>       ShouldContinue = iterateOnFunction(F);
>> -    if (splitCriticalEdges())
>> -      ShouldContinue = true;
>>       Changed |= ShouldContinue;
>>       ++Iteration;
>>     }
>> @@ -2310,6 +2317,7 @@ bool GVN::runOnFunction(Function& F) {
>>         Changed |= PREChanged;
>>       }
>>     }
>> +
>>     // FIXME: Should perform GVN again after PRE does something.  PRE can move
>>     // computations into blocks where they become fully redundant.  Note that
>>     // we can't do this until PRE's critical edge splitting updates memdep.
>> @@ -2543,6 +2551,15 @@ bool GVN::performPRE(Function &F) {
>>     return Changed;
>>   }
>>
>> +/// Split the critical edge connecting the given two blocks, and return
>> +/// the block inserted to the critical edge.
>> +BasicBlock *GVN::splitCriticalEdges(BasicBlock *Pred, BasicBlock *Succ) {
>> +  BasicBlock *BB = SplitCriticalEdge(Pred, Succ, this);
>> +  if (MD)
>> +    MD->invalidateCachedPredecessors();
>> +  return BB;
>> +}
>> +
>>   /// splitCriticalEdges - Split critical edges found during the previous
>>   /// iteration that may enable further optimization.
>>   bool GVN::splitCriticalEdges() {
>> @@ -2569,9 +2586,18 @@ bool GVN::iterateOnFunction(Function &F)
>>          RE = RPOT.end(); RI != RE; ++RI)
>>       Changed |= processBlock(*RI);
>>   #else
>> +  // Save the blocks this function have before transformation begins. GVN may
>> +  // split critical edge, and hence may invalidate the RPO/DT iterator.
>> +  //
>> +  std::vector<BasicBlock *> BBVect;
>> +  BBVect.reserve(256);
>>     for (df_iterator<DomTreeNode*> DI = df_begin(DT->getRootNode()),
>>          DE = df_end(DT->getRootNode()); DI != DE; ++DI)
>> -    Changed |= processBlock(DI->getBlock());
>> +    BBVect.push_back(DI->getBlock());
>> +
>> +  for (std::vector<BasicBlock *>::iterator I = BBVect.begin(), E = BBVect.end();
>> +       I != E; I++)
>> +    Changed |= processBlock(*I);
>>   #endif
>>
>>     return Changed;
>>
>>
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits




More information about the llvm-commits mailing list