[llvm] r207713 - Patch for function cloning to inline all blocks whose address is taken

Reid Kleckner rnk at google.com
Thu May 1 12:48:44 PDT 2014


Can we transform indirectbr into a chain of comparisons and branches
against the old BB labels on inlining?  This is probably not profitable,
but at least it maintains the correctness of the program while honoring
alwaysinline.


On Thu, May 1, 2014 at 11:58 AM, Gerolf Hoflehner <ghoflehner at apple.com>wrote:

> We had some discussion on this mailing list about the always_inline
> attribute and converged
> at the semantics “always inline when it safe to do so”. In the test
> case below it is safe to do, and it would expose the bug this patch is
> taking care
> of, but escape analysis is not ready yet to let inline happen. Until
> then the test case is sitting on the bench.
>
> At one point I had suggested enabling it with an user assertion (option!)
> like “I tell you always
> inline is safe”, but got hammered for it. :-) If there is a change of mind
> and
> the urge for an immediate test case now overrules the concern about an
> extra option,
> I will pursue that route again.
>
> Cheers
> Gerolf
>
> Test case:
>
> #include <stdio.h>
> extern int foo(int);
> extern int gv;
>
> __attribute__((always_inline)) int static bar(int v)
> {
>     void *goToNext;
>
>     goToNext = &&DecrementV; //local
>     goto DecrementV;
>
> DecrementV:
>     if (v == 1) {
>         goToNext = &&ByeBye;  //local
>     }
>     v--;
>     goto *goToNext; // can only jump to local label.
>
> ByeBye:
>     return foo(v);
> }
>
>
>
> int main() { int res;  res=bar(gv);  return res; }
>
>
> --- test/Transforms/Inline/inline-all-address-taken-blocks.ll (revision 0)
> +++ test/Transforms/Inline/inline-all-address-taken-blocks.ll (working
> copy)
> @@ -0,0 +1,39 @@
> +; RUN: opt < %s -O3 -S | FileCheck %s
> + at gv = external global i32
> +
> +define i32 @main() nounwind {
> +entry:
> +; CHECK-NOT: call i32 @bar
> +; CHECK: call i32 @foo
> +  %0 = load i32* @gv, align 4
> +  %call = tail call i32 @bar(i32 %0)
> +  ret i32 %call
> +}
> +
> +define internal i32 @bar(i32 %v) alwaysinline {
> +entry:
> +  br label %DecrementV
> +
> +DecrementV:                                       ; preds = %entry,
> %DecrementV
> +  %v.addr.0 = phi i32 [ %v, %entry ], [ %dec, %DecrementV ]
> +  %goToNext.0 = phi i8* [ blockaddress(@bar, %DecrementV), %entry ], [
> %.goToNext.0, %DecrementV ]
> +  %cmp = icmp eq i32 %v.addr.0, 1
> +  %.goToNext.0 = select i1 %cmp, i8* blockaddress(@bar, %ByeBye), i8*
> %goToNext.0
> +  %dec = add nsw i32 %v.addr.0, -1
> +  indirectbr i8* %.goToNext.0, [label %DecrementV, label %ByeBye]
> +
> +ByeBye:                                           ; preds = %DecrementV
> +  %call = tail call i32 @foo(i32 %dec) #3
> +  ret i32 %call
> +}
> +
> +declare i32 @foo(i32) nounwind
>
> On May 1, 2014, at 11:03 AM, Eric Christopher <echristo at gmail.com> wrote:
>
> It's not particularly possible to test it. I had the discussion with
> Gerolf there a bit. It mostly relies on being able to inline things
> that right now we wouldn't be able to without better escape analysis.
> I think the bug is real which is why I finally said ok. :\
>
> -eric
>
> On Wed, Apr 30, 2014 at 9:11 PM, Rafael Espíndola
> <rafael.espindola at gmail.com> wrote:
>
> is it possible to test this?
>
> On 30 April 2014 18:05, Gerolf Hoflehner <ghoflehner at apple.com> wrote:
>
> Author: ghoflehner
> Date: Wed Apr 30 17:05:02 2014
> New Revision: 207713
>
> URL: http://llvm.org/viewvc/llvm-project?rev=207713&view=rev
> Log:
> Patch for function cloning to inline all blocks whose address is taken
>
> Not all address taken blocks get inlined. The reason is
> that a blocks new address is known only when it is cloned. But e.g.
> a branch instruction in a different block could need that address earlier
> while it gets cloned. The solution is to collect the set of all
> blocks that can potentially get inlined and compute a new block address
> up front. Then clone and cleanup.
>
> rdar://16427209
>
> Modified:
>    llvm/trunk/lib/Transforms/Utils/CloneFunction.cpp
>
> Modified: llvm/trunk/lib/Transforms/Utils/CloneFunction.cpp
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Utils/CloneFunction.cpp?rev=207713&r1=207712&r2=207713&view=diff
>
> ==============================================================================
> --- llvm/trunk/lib/Transforms/Utils/CloneFunction.cpp (original)
> +++ llvm/trunk/lib/Transforms/Utils/CloneFunction.cpp Wed Apr 30 17:05:02
> 2014
> @@ -32,6 +32,7 @@
> #include "llvm/Transforms/Utils/Local.h"
> #include "llvm/Transforms/Utils/ValueMapper.h"
> #include <map>
> +#include <set>
> using namespace llvm;
>
> // CloneBasicBlock - See comments in Cloning.h
> @@ -272,42 +273,36 @@ namespace {
>       NameSuffix(nameSuffix), CodeInfo(codeInfo), DL(DL) {
>     }
>
> -    /// CloneBlock - The specified block is found to be reachable, clone
> it and
> -    /// anything that it can reach.
> +    /// CloneBlock - The specified block is found to be reachable, so
> clone it
> +    /// into NewBB.
> +    /// ToClone is the vector of actually cloned blocks.
> +    /// OrigBBs is the set of all blocks reacheable from the entry block.
> +    /// It contains the block candidates and makes sure each block
> +    /// is cloned at most once.
>     void CloneBlock(const BasicBlock *BB,
> -                    std::vector<const BasicBlock*> &ToClone);
> +                    BasicBlock *NewBB,
> +                    std::vector<const BasicBlock *> &ToClone,
> +                    std::set<const BasicBlock *> &OrigBBs);
>   };
> }
>
> -/// CloneBlock - The specified block is found to be reachable, clone it
> and
> -/// anything that it can reach.
> +/// CloneBlock - The specified block is found to be reachable, so clone it
> +/// into NewBB.
> +/// ToClone is the vector of actually cloned blocks.
> +/// OrigBBs is the set of all blocks reacheable from the entry block.
> +/// It contains the block candidates and makes sure each block
> +/// is cloned at most once.
> void PruningFunctionCloner::CloneBlock(const BasicBlock *BB,
> -                                       std::vector<const BasicBlock*>
> &ToClone){
> -  WeakVH &BBEntry = VMap[BB];
> -
> -  // Have we already cloned this block?
> -  if (BBEntry) return;
> +                                       BasicBlock *NewBB,
> +                                       std::vector<const BasicBlock *>
> &ToClone,
> +                                       std::set<const BasicBlock *>
> &OrigBBs) {
>
> -  // Nope, clone it now.
> -  BasicBlock *NewBB;
> -  BBEntry = NewBB = BasicBlock::Create(BB->getContext());
> -  if (BB->hasName()) NewBB->setName(BB->getName()+NameSuffix);
> +  // Remove BB from list of blocks to clone.
> +  // When it was not in the list, it has been cloned already, so
> +  // don't clone again.
> +  if (!OrigBBs.erase(BB)) return;
>
> -  // It is only legal to clone a function if a block address within that
> -  // function is never referenced outside of the function.  Given that, we
> -  // want to map block addresses from the old function to block addresses
> in
> -  // the clone. (This is different from the generic ValueMapper
> -  // implementation, which generates an invalid blockaddress when
> -  // cloning a function.)
> -  //
> -  // Note that we don't need to fix the mapping for unreachable blocks;
> -  // the default mapping there is safe.
> -  if (BB->hasAddressTaken()) {
> -    Constant *OldBBAddr =
> BlockAddress::get(const_cast<Function*>(OldFunc),
> -                                            const_cast<BasicBlock*>(BB));
> -    VMap[OldBBAddr] = BlockAddress::get(NewFunc, NewBB);
> -  }
> -
> +  // Nope, clone it now.
>
>   bool hasCalls = false, hasDynamicAllocas = false, hasStaticAllocas =
> false;
>
> @@ -425,7 +420,7 @@ void llvm::CloneAndPruneFunctionInto(Fun
>                                      const DataLayout *DL,
>                                      Instruction *TheCall) {
>   assert(NameSuffix && "NameSuffix cannot be null!");
> -
> +
> #ifndef NDEBUG
>   for (Function::const_arg_iterator II = OldFunc->arg_begin(),
>        E = OldFunc->arg_end(); II != E; ++II)
> @@ -435,15 +430,91 @@ void llvm::CloneAndPruneFunctionInto(Fun
>   PruningFunctionCloner PFC(NewFunc, OldFunc, VMap, ModuleLevelChanges,
>                             NameSuffix, CodeInfo, DL);
>
> -  // Clone the entry block, and anything recursively reachable from it.
> +  // Since all BB address references need to be known before
> block-by-block
> +  // processing, we need to create all reachable blocks before processing
> +  // them for instruction cloning and pruning. Some of these blocks may
> +  // be removed due to later pruning.
>   std::vector<const BasicBlock*> CloneWorklist;
> +  //
> +  // OrigBBs consists of all blocks reachable from the entry
> +  // block.
> +  // This list will be pruned down by the CloneFunction() due to two
> +  // two optimizations:
> +  // First, when a conditional branch target is known at compile-time,
> +  // only the actual branch destination block needs to be cloned.
> +  // Second, when a switch statement target is known at compile-time,
> +  // only the actual case statement needs to be cloned.
> +  std::set<const BasicBlock *> OrigBBs;
> +
> +  CloneWorklist.push_back(&OldFunc->getEntryBlock());
> +  while (!CloneWorklist.empty()) {
> +    const BasicBlock *BB = CloneWorklist.back();
> +    CloneWorklist.pop_back();
> +
> +    // Don't revisit blocks.
> +    if (VMap.count(BB))
> +      continue;
> +
> +    BasicBlock *NewBB = BasicBlock::Create(BB->getContext());
> +    if (BB->hasName())
> +      NewBB->setName(BB->getName() + NameSuffix);
> +
> +    // It is legal to clone a function when a block address within that
> +    // function is never escapes outside of the function.  Given that, we
> +    // want to map block addresses from the old function to block
> addresses in
> +    // the clone. (This is different from the generic ValueMapper
> +    // implementation, which generates an invalid block address when
> +    // cloning a function.)
> +    // Note the current escape address does not catch all legal cases:
> even
> +    // when all block addresses taken are local and the function has the
> +    // always_inline attribute due to the indirect branch inlining is
> +    // suppressed.
> +    // Note that we don't need to fix the mapping for unreachable blocks;
> +    // the default mapping there is safe.
> +    if (BB->hasAddressTaken()) {
> +      Constant *OldBBAddr =
> BlockAddress::get(const_cast<Function*>(OldFunc),
> +
>                                              const_cast<BasicBlock*>(BB));
> +      VMap[OldBBAddr] = BlockAddress::get(NewFunc, NewBB);
> +    }
> +
> +    OrigBBs.insert(BB);
> +    VMap[BB] = NewBB;
> +    // Iterate over all possible successors and add them to the
> CloneWorklist.
> +    const TerminatorInst *Term = BB->getTerminator();
> +    for (unsigned i = 0, e = Term->getNumSuccessors(); i != e; ++i) {
> +      BasicBlock *Succ = Term->getSuccessor(i);
> +      CloneWorklist.push_back(Succ);
> +    }
> +  }
> +
> +  // Now, fill only the reachable blocks with the cloned contents
> +  // of the originals.
> +  assert(CloneWorklist.empty() && "Dirty worklist before re-use\n");
>   CloneWorklist.push_back(&OldFunc->getEntryBlock());
>   while (!CloneWorklist.empty()) {
>     const BasicBlock *BB = CloneWorklist.back();
>     CloneWorklist.pop_back();
> -    PFC.CloneBlock(BB, CloneWorklist);
> +    PFC.CloneBlock(BB, cast<BasicBlock>(VMap[BB]), CloneWorklist,
> +                   OrigBBs);
>   }
> -
> +
> +  // FIXME: Delete BB's that were created but have been pruned.
> +  // Actual cloning may have found pruning opportunities since
> +  // branch or switch statement target may have been known at
> compile-time.
> +  // Alternatively we could write a routine CloneFunction and add a) a
> +  // parameter to actually do the cloning and b) a return parameter that
> +  // gives a list of blocks that need to be cloned also. Then we could
> +  // call CloneFunction when we collect the blocks to clone, but suppress
> +  // cloning. And then actually *do* the cloning in the while loop above.
> Then
> +  // the cleanup here would become redundant, and so would be the OrigBBs.
> +  for (std::set<const BasicBlock *>::iterator Oi = OrigBBs.begin(),
> +       Oe = OrigBBs.end(); Oi != Oe; ++Oi) {
> +    const BasicBlock *Orig = *Oi;
> +    BasicBlock *NewBB = cast<BasicBlock>(VMap[Orig]);
> +    delete NewBB;
> +    VMap[Orig] = 0;
> +  }
> +
>   // Loop over all of the basic blocks in the old function.  If the block
> was
>   // reachable, we have cloned it and the old block is now in the value
> map:
>   // insert it into the new function in the right order.  If not, ignore
> it.
> @@ -454,7 +525,8 @@ void llvm::CloneAndPruneFunctionInto(Fun
>        BI != BE; ++BI) {
>     Value *V = VMap[BI];
>     BasicBlock *NewBB = cast_or_null<BasicBlock>(V);
> -    if (!NewBB) continue;  // Dead block.
> +    if (!NewBB)
> +      continue; // Dead block.
>
>     // Add the new block to the new function.
>     NewFunc->getBasicBlockList().push_back(NewBB);
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140501/3d35fbc0/attachment.html>


More information about the llvm-commits mailing list