Hi Tobias,<div>I did make some changes to use <meta http-equiv="content-type" content="text/html; charset=utf-8">splitBlockPredecessors in splitting entry nodes.</div><div>I tested it with coreutils and MySQL together with region-extractor pass. </div>

<div>If you're still interested in region-extractor pass, I'll make a patch.</div><div>Vu<br><br><div class="gmail_quote">On Mon, Feb 28, 2011 at 8:35 AM, Tobias Grosser <span dir="ltr"><<a href="mailto:grosser@fim.uni-passau.de">grosser@fim.uni-passau.de</a>></span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">On 02/01/2011 07:37 PM, Vu Le wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

Hi Tobias,<br>

</blockquote>

<br>

I would like to get this one in LLVM 2.9. Do you happen to have an updated patch available?<br>

<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

        +void RegionSimplify::print(raw_ostream&O, const Module *M) const {<br>

        +  BasicBlock *enteringBlock;<br>

        +  BasicBlock *exitingBlock;<br>

        +<br>

        +  if (modified) {<br>

        +    enteringBlock = CR->getEnteringBlock();<br>

        +    exitingBlock = CR->getExitingBlock();<br>

        +<br>

        +    O<< "\nRegion: ["<<  CR->getNameStr()<< "] Edges:\n";<br>

        +    if (enteringBlock)<br>

        +      O<< "  Entry: ]"<<  enteringBlock->getNameStr()<< " => "<br>

        +<<  enteringBlock->getNameStr()<< "]\n";<br>

        +    if (exitingBlock)<br>

        +      O<< "  Exit: ["<<  exitingBlock->getNameStr()<< " => "<br>

        +<<  exitingBlock->getNameStr()<< "[\n";<br>

<br>

    Can you use CR->getNameStr() instead of formatting this yourself?<br>

<br>

<br>

This is is Andreas' code. Do you mean if the region is modified, we just<br>

print CR->getNameStr()?<br>

</blockquote>

<br>

OK. I did not get that you add additional information. This is fine. I would just format it like this:<br>

<br>

void RegionSimplify::print(raw_ostream &O, const Module *M) const {<br>

  if (!modified)<br>

    return;<br>

<br>

<br>

  BasicBlock *enteringBlock = CR->getEnteringBlock();<br>

  BasicBlock *exitingBlock = CR->getExitingBlock();<br>

<br>

[...]<br>

<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

        +<br>

        +    O<< "\n";<br>

        +  }<br>

        +}<br>

        +<br>

        +void RegionSimplify::getAnalysisUsage(AnalysisUsage&AU) const {<br>

        +  AU.addPreserved<DominatorTree>  ();<br>

        +  AU.addRequired<RegionInfo>  ();<br>

<br>

    Does this transformation preserve RegionInfo?<br>

<br>

<br>

Yes, I think RegionInfo is also preserved.<br>

</blockquote>

OK. So let's state that we preserve it.<br>

<br>

<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

        +// createSingleEntryEdge - Split the entry BasicBlock of the given<br>

        +// region after the last PHINode to form a single entry edge.<br>

        +// This is similar to CodeExtractor::severSplitPHINodes<br>

        +BasicBlock *RegionSimplify::createSingleEntryEdge(Region *R) {<br>

        +  Function *f = R->getEntry()->getParent();<br>

        +  if (&f->getEntryBlock() == R->getEntry())<br>

        +    return NULL; // Entry node is the function's entry blocks<br>

<br>

    Why do you need a special case for this? AS the entry node as never<br>

    any predecessors, I would imaging there would automatically be no<br>

    splitting.<br>

<br>

<br>

I thought that we should not split regions whose entry is the function<br>

entry. But I was wrong. We can split those. But I don't quite understand<br>

what you mean.<br>

</blockquote>

<br>

LLVM does not allow to have jumps directly to the entry basic block of a function. This function has never any predeccessor and consequently does not need to be splitted.<br>

<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

        +<br>

        +  BasicBlock *oldEntry = R->getEntry();<br>

        +  PHINode *PN = dyn_cast<PHINode>  (oldEntry->begin());<br>

        +  if (!PN)<br>

        +    return NULL; // No PHI nodes.<br>

        +<br>

        +  BasicBlock::iterator AfterPHIs = oldEntry->getFirstNonPHI();<br>

        +  BasicBlock *newEntry = oldEntry->splitBasicBlock(AfterPHIs,<br>

        +      oldEntry->getName() + ".simregentry");<br>

<br>

    'simregionentry' sounds wrong. What about '.single_region_entry',<br>

    'singleentry', ...?<br>

<br>

        +<br>

        +  // Okay, update dominator sets.<br>

<br>

    // Update dominator tree.<br>

<br>

        +  if (DominatorTree *DT =<br>

        getAnalysisIfAvailable<DominatorTree>()) {<br>

        +    succ_iterator secondSucc = succ_begin(newEntry) + 1;<br>

        +    if (secondSucc == succ_end(newEntry)) //newEntry has 1<br>

        successor<br>

        +      DT->splitBlock(newEntry);<br>

        +    else { // newEntry has more than 1 successor, update DT<br>

        manually<br>

        +      // oldEntry dominates newEntry.<br>

        +      // newEntry node dominates all other nodes dominated by<br>

        oldEntry.<br>

        +      DomTreeNode *OldNode = DT->getNode(oldEntry);<br>

        +      if (OldNode) { // don't bother if oldEntry doesn't<br>

        dominates any node<br>

        +        std::vector<DomTreeNode *>  Children;<br>

        +        for (DomTreeNode::iterator I = OldNode->begin(), E =<br>

        OldNode->end(); I<br>

        +            != E; ++I)<br>

        +          Children.push_back(*I);<br>

        +<br>

        +        DomTreeNode *NewNode = DT->addNewBlock(newEntry, oldEntry);<br>

        +        for (std::vector<DomTreeNode *>::iterator I =<br>

        Children.begin(), E =<br>

        +            Children.end(); I != E; ++I)<br>

        +          DT->changeImmediateDominator(*I, NewNode);<br>

        +      }<br>

        +    }<br>

        +  }<br>

        +<br>

        +  // Loop over all of the predecessors of the old entry that<br>

        are in the region,<br>

        +  // changing them to branch to the new entry instead of the<br>

        old one<br>

        +  for (pred_iterator PI = pred_begin(oldEntry), PE =<br>

        pred_end(oldEntry); PI<br>

        +      != PE; ++PI) {<br>

        +    if (R->contains(*PI)) {<br>

        +      TerminatorInst *TI = (*PI)->getTerminator();<br>

        +      TI->replaceUsesOfWith(oldEntry, newEntry);<br>

        +    }<br>

        +  }<br>

        +  // just have to update the PHI nodes now, inserting PHI nodes<br>

        into NewBB.<br>

        +  for (BasicBlock::iterator PI = oldEntry->begin();<br>

        isa<PHINode>  (PI); ++PI) {<br>

        +    PHINode *PN = cast<PHINode>  (PI);<br>

        +    // Create a new PHI node in the new region, which has an<br>

        incoming value<br>

        +    // from oldEntry of PN.<br>

        +    PHINode *NewPN = PHINode::Create(PN->getType(),<br>

        PN->getName() + ".ph",<br>

        +        newEntry->begin());<br>

        +<br>

        +    NewPN->addIncoming(PN, oldEntry);<br>

        +<br>

        +    // Loop over all of the incoming value in PN, moving them<br>

        to NewPN if they<br>

        +    // are from the region.<br>

        +    for (unsigned i = 0; i != PN->getNumIncomingValues(); ++i) {<br>

        +      BasicBlock *HasValToPN = PN->getIncomingBlock(i);<br>

        +      if (R->contains(HasValToPN)) {<br>

        +        NewPN->addIncoming(PN->getIncomingValue(i), HasValToPN);<br>

        +        PN->removeIncomingValue(i);<br>

        +        --i;<br>

        +      }<br>

        +    }<br>

        +  }<br>

<br>

    Do you think we can use splitBlockPredecessors to simplify all this?<br>

<br>

I guess yes. The different here is if we use splitBlockPredecessors,<br>

Entry is split into NewNode->Entry. The entry node of R is not still Entry.<br>

<br>

We only need to update regions whose exit is Entry.<br>

Their exit must be changed into NewNode.<br>

I'm not quite sure how to do that.<br>

</blockquote>

<br>

You are right. I just thought about this again and it seems to be better to split Entry->NewNode. We cannot change anything that is outside of the region.<br>

<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

        +  return newEntry;<br>

        +}<br>

        +<br>

        +// createSingleExitEdge - Split the exit BasicBlock of the<br>

        given region<br>

        +// to form a single exit edge.<br>

        +// This does not update RegionInfo analysis.<br>

        +BasicBlock *RegionSimplify::createSingleExitEdge(Region *R) {<br>

        +  BasicBlock *BB = R->getExit();<br>

        +<br>

        +  SmallVector<BasicBlock*, 4>  Preds;<br>

        +  for (pred_iterator PI = pred_begin(BB), PE = pred_end(BB); PI<br>

        != PE; ++PI)<br>

        +    if (R->contains(*PI))<br>

        +      Preds.push_back(*PI);<br>

        +<br>

        +  return SplitBlockPredecessors(BB, Preds.data(), Preds.size(),<br>

        ".simregexit",<br>

        +      this);<br>

<br>

    I propose to update RegionInfo here. Copying the setRegionFor from<br>

    runOnRegion will not be sufficient. You need to update all regions<br>

    whose entry node was the old exit node of this region.<br>

<br>

<br>

Why would we do that?<br>

Suppose another region X has entry oldExit.<br>

In region R, oldExit is split into (NewExit->oldExit).<br>

R is now the smallest region containing NewExit.<br>

<br>

I think it's still OK if the entry of X is oldExit.<br>

</blockquote>

Agreed. Forget about the comment.<br>

<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

        +bool RegionSimplify::runOnRegion(Region *R, RGPassManager&RGM) {<br>

        +  RegionInfo *RI =&getAnalysis<RegionInfo>  ();<br>

        +  modified = false;<br>

        +<br>

        +  CR = R;<br>

        +  if (!R->isTopLevelRegion()) {<br>

        +    BasicBlock *newBB;<br>

        +    BasicBlock *oldBB;<br>

        +<br>

        +    if (!(R->getEnteringBlock())) {<br>

        +      oldBB = R->getEntry();<br>

        +<br>

        +      newBB = createSingleEntryEdge(R);<br>

        +      if (newBB) { // update RegionInfo only if we split entry<br>

        successfully<br>

        +        RI->splitBlock(newBB, oldBB);<br>

<br>

    I would put this into createSingleEntryEdge(), as you also update<br>

    the dominance information there.<br>

<br>

<br>

OK.<br>

<br>

<br>

        +<br>

        +        modified |= true;<br>

        +        ++NumEntries;<br>

        +      }<br>

        +    }<br>

        +<br>

        +    if (!(R->getExitingBlock())) {<br>

        +      oldBB = R->getExit();<br>

        +      newBB = createSingleExitEdge(R);<br>

        +<br>

        +      RI->setRegionFor(newBB, R);<br>

<br>

    I would update the RI in the createSingleExitEdge as you also update<br>

    the regioninto there.<br>

<br>

OK.<br>

<br>

<br>

        +<br>

        +      modified |= true;<br>

        +      ++NumExits;<br>

        +    }<br>

        +  }<br>

        +<br>

        +  return modified;<br>

        +}<br>

        diff --git a/lib/Transforms/Scalar/Scalar.cpp<br>

        b/lib/Transforms/Scalar/Scalar.cpp<br>

        index bf9ca6d..5d18f22 100644<br>

        --- a/lib/Transforms/Scalar/Scalar.cpp<br>

        +++ b/lib/Transforms/Scalar/Scalar.cpp<br>

        @@ -51,6 +51,7 @@ void<br>

        llvm::initializeScalarOpts(PassRegistry&Registry) {<br>

            initializeMemCpyOptPass(Registry);<br>

            initializeReassociatePass(Registry);<br>

            initializeRegToMemPass(Registry);<br>

        +  initializeRegionSimplifyPass(Registry);<br>

            initializeSCCPPass(Registry);<br>

            initializeIPSCCPPass(Registry);<br>

            initializeSROA_DTPass(Registry);<br>

<br>

<br>

    Furthermore, you should add a couple of test cases for the different<br>

    branches.<br>

<br>

I tested with mysql code. Do you know how to verify that our pass<br>

preserves the semantics of the program?<br>

</blockquote>

<br>

You can compile some code in the LLVM test-suite with it. However, I believe it is fine if you generate a bunch of test cases for which you make sure it is doing the right thing.<br>

<br>

Cheers<br>

Tobi<br>

</blockquote></div><br></div>