[llvm-commits] Patch for simplifying regions

Mon Feb 28 08:35:06 PST 2011

On 02/01/2011 07:37 PM, Vu Le wrote:
> Hi Tobias,

I would like to get this one in LLVM 2.9. Do you happen to have an 
updated patch available?

>         +void RegionSimplify::print(raw_ostream&O, const Module *M) const {
>         +  BasicBlock *enteringBlock;
>         +  BasicBlock *exitingBlock;
>         +
>         +  if (modified) {
>         +    enteringBlock = CR->getEnteringBlock();
>         +    exitingBlock = CR->getExitingBlock();
>         +
>         +    O<< "\nRegion: ["<<  CR->getNameStr()<< "] Edges:\n";
>         +    if (enteringBlock)
>         +      O<< "  Entry: ]"<<  enteringBlock->getNameStr()<< " => "
>         +<<  enteringBlock->getNameStr()<< "]\n";
>         +    if (exitingBlock)
>         +      O<< "  Exit: ["<<  exitingBlock->getNameStr()<< " => "
>         +<<  exitingBlock->getNameStr()<< "[\n";
>
>     Can you use CR->getNameStr() instead of formatting this yourself?
>
>
> This is is Andreas' code. Do you mean if the region is modified, we just
> print CR->getNameStr()?

OK. I did not get that you add additional information. This is fine. I 
would just format it like this:

void RegionSimplify::print(raw_ostream &O, const Module *M) const {
   if (!modified)
     return;

   BasicBlock *enteringBlock = CR->getEnteringBlock();
   BasicBlock *exitingBlock = CR->getExitingBlock();

[...]

>         +
>         +    O<< "\n";
>         +  }
>         +}
>         +
>         +void RegionSimplify::getAnalysisUsage(AnalysisUsage&AU) const {
>         +  AU.addPreserved<DominatorTree>  ();
>         +  AU.addRequired<RegionInfo>  ();
>
>     Does this transformation preserve RegionInfo?
>
>
> Yes, I think RegionInfo is also preserved.
OK. So let's state that we preserve it.

>         +// createSingleEntryEdge - Split the entry BasicBlock of the given
>         +// region after the last PHINode to form a single entry edge.
>         +// This is similar to CodeExtractor::severSplitPHINodes
>         +BasicBlock *RegionSimplify::createSingleEntryEdge(Region *R) {
>         +  Function *f = R->getEntry()->getParent();
>         +  if (&f->getEntryBlock() == R->getEntry())
>         +    return NULL; // Entry node is the function's entry blocks
>
>     Why do you need a special case for this? AS the entry node as never
>     any predecessors, I would imaging there would automatically be no
>     splitting.
>
>
> I thought that we should not split regions whose entry is the function
> entry. But I was wrong. We can split those. But I don't quite understand
> what you mean.

LLVM does not allow to have jumps directly to the entry basic block of a 
function. This function has never any predeccessor and consequently does 
not need to be splitted.

>         +
>         +  BasicBlock *oldEntry = R->getEntry();
>         +  PHINode *PN = dyn_cast<PHINode>  (oldEntry->begin());
>         +  if (!PN)
>         +    return NULL; // No PHI nodes.
>         +
>         +  BasicBlock::iterator AfterPHIs = oldEntry->getFirstNonPHI();
>         +  BasicBlock *newEntry = oldEntry->splitBasicBlock(AfterPHIs,
>         +      oldEntry->getName() + ".simregentry");
>
>     'simregionentry' sounds wrong. What about '.single_region_entry',
>     'singleentry', ...?
>
>         +
>         +  // Okay, update dominator sets.
>
>     // Update dominator tree.
>
>         +  if (DominatorTree *DT =
>         getAnalysisIfAvailable<DominatorTree>()) {
>         +    succ_iterator secondSucc = succ_begin(newEntry) + 1;
>         +    if (secondSucc == succ_end(newEntry)) //newEntry has 1
>         successor
>         +      DT->splitBlock(newEntry);
>         +    else { // newEntry has more than 1 successor, update DT
>         manually
>         +      // oldEntry dominates newEntry.
>         +      // newEntry node dominates all other nodes dominated by
>         oldEntry.
>         +      DomTreeNode *OldNode = DT->getNode(oldEntry);
>         +      if (OldNode) { // don't bother if oldEntry doesn't
>         dominates any node
>         +        std::vector<DomTreeNode *>  Children;
>         +        for (DomTreeNode::iterator I = OldNode->begin(), E =
>         OldNode->end(); I
>         +            != E; ++I)
>         +          Children.push_back(*I);
>         +
>         +        DomTreeNode *NewNode = DT->addNewBlock(newEntry, oldEntry);
>         +        for (std::vector<DomTreeNode *>::iterator I =
>         Children.begin(), E =
>         +            Children.end(); I != E; ++I)
>         +          DT->changeImmediateDominator(*I, NewNode);
>         +      }
>         +    }
>         +  }
>         +
>         +  // Loop over all of the predecessors of the old entry that
>         are in the region,
>         +  // changing them to branch to the new entry instead of the
>         old one
>         +  for (pred_iterator PI = pred_begin(oldEntry), PE =
>         pred_end(oldEntry); PI
>         +      != PE; ++PI) {
>         +    if (R->contains(*PI)) {
>         +      TerminatorInst *TI = (*PI)->getTerminator();
>         +      TI->replaceUsesOfWith(oldEntry, newEntry);
>         +    }
>         +  }
>         +  // just have to update the PHI nodes now, inserting PHI nodes
>         into NewBB.
>         +  for (BasicBlock::iterator PI = oldEntry->begin();
>         isa<PHINode>  (PI); ++PI) {
>         +    PHINode *PN = cast<PHINode>  (PI);
>         +    // Create a new PHI node in the new region, which has an
>         incoming value
>         +    // from oldEntry of PN.
>         +    PHINode *NewPN = PHINode::Create(PN->getType(),
>         PN->getName() + ".ph",
>         +        newEntry->begin());
>         +
>         +    NewPN->addIncoming(PN, oldEntry);
>         +
>         +    // Loop over all of the incoming value in PN, moving them
>         to NewPN if they
>         +    // are from the region.
>         +    for (unsigned i = 0; i != PN->getNumIncomingValues(); ++i) {
>         +      BasicBlock *HasValToPN = PN->getIncomingBlock(i);
>         +      if (R->contains(HasValToPN)) {
>         +        NewPN->addIncoming(PN->getIncomingValue(i), HasValToPN);
>         +        PN->removeIncomingValue(i);
>         +        --i;
>         +      }
>         +    }
>         +  }
>
>     Do you think we can use splitBlockPredecessors to simplify all this?
>
> I guess yes. The different here is if we use splitBlockPredecessors,
> Entry is split into NewNode->Entry. The entry node of R is not still Entry.
>
> We only need to update regions whose exit is Entry.
> Their exit must be changed into NewNode.
> I'm not quite sure how to do that.

You are right. I just thought about this again and it seems to be better 
to split Entry->NewNode. We cannot change anything that is outside of 
the region.

>         +  return newEntry;
>         +}
>         +
>         +// createSingleExitEdge - Split the exit BasicBlock of the
>         given region
>         +// to form a single exit edge.
>         +// This does not update RegionInfo analysis.
>         +BasicBlock *RegionSimplify::createSingleExitEdge(Region *R) {
>         +  BasicBlock *BB = R->getExit();
>         +
>         +  SmallVector<BasicBlock*, 4>  Preds;
>         +  for (pred_iterator PI = pred_begin(BB), PE = pred_end(BB); PI
>         != PE; ++PI)
>         +    if (R->contains(*PI))
>         +      Preds.push_back(*PI);
>         +
>         +  return SplitBlockPredecessors(BB, Preds.data(), Preds.size(),
>         ".simregexit",
>         +      this);
>
>     I propose to update RegionInfo here. Copying the setRegionFor from
>     runOnRegion will not be sufficient. You need to update all regions
>     whose entry node was the old exit node of this region.
>
>
> Why would we do that?
> Suppose another region X has entry oldExit.
> In region R, oldExit is split into (NewExit->oldExit).
> R is now the smallest region containing NewExit.
>
> I think it's still OK if the entry of X is oldExit.
Agreed. Forget about the comment.

>         +bool RegionSimplify::runOnRegion(Region *R, RGPassManager&RGM) {
>         +  RegionInfo *RI =&getAnalysis<RegionInfo>  ();
>         +  modified = false;
>         +
>         +  CR = R;
>         +  if (!R->isTopLevelRegion()) {
>         +    BasicBlock *newBB;
>         +    BasicBlock *oldBB;
>         +
>         +    if (!(R->getEnteringBlock())) {
>         +      oldBB = R->getEntry();
>         +
>         +      newBB = createSingleEntryEdge(R);
>         +      if (newBB) { // update RegionInfo only if we split entry
>         successfully
>         +        RI->splitBlock(newBB, oldBB);
>
>     I would put this into createSingleEntryEdge(), as you also update
>     the dominance information there.
>
>
> OK.
>
>
>         +
>         +        modified |= true;
>         +        ++NumEntries;
>         +      }
>         +    }
>         +
>         +    if (!(R->getExitingBlock())) {
>         +      oldBB = R->getExit();
>         +      newBB = createSingleExitEdge(R);
>         +
>         +      RI->setRegionFor(newBB, R);
>
>     I would update the RI in the createSingleExitEdge as you also update
>     the regioninto there.
>
> OK.
>
>
>         +
>         +      modified |= true;
>         +      ++NumExits;
>         +    }
>         +  }
>         +
>         +  return modified;
>         +}
>         diff --git a/lib/Transforms/Scalar/Scalar.cpp
>         b/lib/Transforms/Scalar/Scalar.cpp
>         index bf9ca6d..5d18f22 100644
>         --- a/lib/Transforms/Scalar/Scalar.cpp
>         +++ b/lib/Transforms/Scalar/Scalar.cpp
>         @@ -51,6 +51,7 @@ void
>         llvm::initializeScalarOpts(PassRegistry&Registry) {
>             initializeMemCpyOptPass(Registry);
>             initializeReassociatePass(Registry);
>             initializeRegToMemPass(Registry);
>         +  initializeRegionSimplifyPass(Registry);
>             initializeSCCPPass(Registry);
>             initializeIPSCCPPass(Registry);
>             initializeSROA_DTPass(Registry);
>
>
>     Furthermore, you should add a couple of test cases for the different
>     branches.
>
> I tested with mysql code. Do you know how to verify that our pass
> preserves the semantics of the program?

You can compile some code in the LLVM test-suite with it. However, I 
believe it is fine if you generate a bunch of test cases for which you 
make sure it is doing the right thing.

Cheers
Tobi