[llvm] r207713 - Patch for function cloning to inline all blocks whose address is taken
Gerolf Hoflehner
ghoflehner at apple.com
Thu May 1 11:58:34 PDT 2014
We had some discussion on this mailing list about the always_inline attribute and converged
at the semantics “always inline when it safe to do so”. In the test
case below it is safe to do, and it would expose the bug this patch is taking care
of, but escape analysis is not ready yet to let inline happen. Until
then the test case is sitting on the bench.
At one point I had suggested enabling it with an user assertion (option!) like “I tell you always
inline is safe”, but got hammered for it. :-) If there is a change of mind and
the urge for an immediate test case now overrules the concern about an extra option,
I will pursue that route again.
Cheers
Gerolf
Test case:
#include <stdio.h>
extern int foo(int);
extern int gv;
__attribute__((always_inline)) int static bar(int v)
{
void *goToNext;
goToNext = &&DecrementV; //local
goto DecrementV;
DecrementV:
if (v == 1) {
goToNext = &&ByeBye; //local
}
v--;
goto *goToNext; // can only jump to local label.
ByeBye:
return foo(v);
}
int main() { int res; res=bar(gv); return res; }
--- test/Transforms/Inline/inline-all-address-taken-blocks.ll (revision 0)
+++ test/Transforms/Inline/inline-all-address-taken-blocks.ll (working copy)
@@ -0,0 +1,39 @@
+; RUN: opt < %s -O3 -S | FileCheck %s
+ at gv = external global i32
+
+define i32 @main() nounwind {
+entry:
+; CHECK-NOT: call i32 @bar
+; CHECK: call i32 @foo
+ %0 = load i32* @gv, align 4
+ %call = tail call i32 @bar(i32 %0)
+ ret i32 %call
+}
+
+define internal i32 @bar(i32 %v) alwaysinline {
+entry:
+ br label %DecrementV
+
+DecrementV: ; preds = %entry, %DecrementV
+ %v.addr.0 = phi i32 [ %v, %entry ], [ %dec, %DecrementV ]
+ %goToNext.0 = phi i8* [ blockaddress(@bar, %DecrementV), %entry ], [ %.goToNext.0, %DecrementV ]
+ %cmp = icmp eq i32 %v.addr.0, 1
+ %.goToNext.0 = select i1 %cmp, i8* blockaddress(@bar, %ByeBye), i8* %goToNext.0
+ %dec = add nsw i32 %v.addr.0, -1
+ indirectbr i8* %.goToNext.0, [label %DecrementV, label %ByeBye]
+
+ByeBye: ; preds = %DecrementV
+ %call = tail call i32 @foo(i32 %dec) #3
+ ret i32 %call
+}
+
+declare i32 @foo(i32) nounwind
On May 1, 2014, at 11:03 AM, Eric Christopher <echristo at gmail.com> wrote:
> It's not particularly possible to test it. I had the discussion with
> Gerolf there a bit. It mostly relies on being able to inline things
> that right now we wouldn't be able to without better escape analysis.
> I think the bug is real which is why I finally said ok. :\
>
> -eric
>
> On Wed, Apr 30, 2014 at 9:11 PM, Rafael Espíndola
> <rafael.espindola at gmail.com> wrote:
>> is it possible to test this?
>>
>> On 30 April 2014 18:05, Gerolf Hoflehner <ghoflehner at apple.com> wrote:
>>> Author: ghoflehner
>>> Date: Wed Apr 30 17:05:02 2014
>>> New Revision: 207713
>>>
>>> URL: http://llvm.org/viewvc/llvm-project?rev=207713&view=rev
>>> Log:
>>> Patch for function cloning to inline all blocks whose address is taken
>>>
>>> Not all address taken blocks get inlined. The reason is
>>> that a blocks new address is known only when it is cloned. But e.g.
>>> a branch instruction in a different block could need that address earlier
>>> while it gets cloned. The solution is to collect the set of all
>>> blocks that can potentially get inlined and compute a new block address
>>> up front. Then clone and cleanup.
>>>
>>> rdar://16427209
>>>
>>> Modified:
>>> llvm/trunk/lib/Transforms/Utils/CloneFunction.cpp
>>>
>>> Modified: llvm/trunk/lib/Transforms/Utils/CloneFunction.cpp
>>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Utils/CloneFunction.cpp?rev=207713&r1=207712&r2=207713&view=diff
>>> ==============================================================================
>>> --- llvm/trunk/lib/Transforms/Utils/CloneFunction.cpp (original)
>>> +++ llvm/trunk/lib/Transforms/Utils/CloneFunction.cpp Wed Apr 30 17:05:02 2014
>>> @@ -32,6 +32,7 @@
>>> #include "llvm/Transforms/Utils/Local.h"
>>> #include "llvm/Transforms/Utils/ValueMapper.h"
>>> #include <map>
>>> +#include <set>
>>> using namespace llvm;
>>>
>>> // CloneBasicBlock - See comments in Cloning.h
>>> @@ -272,42 +273,36 @@ namespace {
>>> NameSuffix(nameSuffix), CodeInfo(codeInfo), DL(DL) {
>>> }
>>>
>>> - /// CloneBlock - The specified block is found to be reachable, clone it and
>>> - /// anything that it can reach.
>>> + /// CloneBlock - The specified block is found to be reachable, so clone it
>>> + /// into NewBB.
>>> + /// ToClone is the vector of actually cloned blocks.
>>> + /// OrigBBs is the set of all blocks reacheable from the entry block.
>>> + /// It contains the block candidates and makes sure each block
>>> + /// is cloned at most once.
>>> void CloneBlock(const BasicBlock *BB,
>>> - std::vector<const BasicBlock*> &ToClone);
>>> + BasicBlock *NewBB,
>>> + std::vector<const BasicBlock *> &ToClone,
>>> + std::set<const BasicBlock *> &OrigBBs);
>>> };
>>> }
>>>
>>> -/// CloneBlock - The specified block is found to be reachable, clone it and
>>> -/// anything that it can reach.
>>> +/// CloneBlock - The specified block is found to be reachable, so clone it
>>> +/// into NewBB.
>>> +/// ToClone is the vector of actually cloned blocks.
>>> +/// OrigBBs is the set of all blocks reacheable from the entry block.
>>> +/// It contains the block candidates and makes sure each block
>>> +/// is cloned at most once.
>>> void PruningFunctionCloner::CloneBlock(const BasicBlock *BB,
>>> - std::vector<const BasicBlock*> &ToClone){
>>> - WeakVH &BBEntry = VMap[BB];
>>> -
>>> - // Have we already cloned this block?
>>> - if (BBEntry) return;
>>> + BasicBlock *NewBB,
>>> + std::vector<const BasicBlock *> &ToClone,
>>> + std::set<const BasicBlock *> &OrigBBs) {
>>>
>>> - // Nope, clone it now.
>>> - BasicBlock *NewBB;
>>> - BBEntry = NewBB = BasicBlock::Create(BB->getContext());
>>> - if (BB->hasName()) NewBB->setName(BB->getName()+NameSuffix);
>>> + // Remove BB from list of blocks to clone.
>>> + // When it was not in the list, it has been cloned already, so
>>> + // don't clone again.
>>> + if (!OrigBBs.erase(BB)) return;
>>>
>>> - // It is only legal to clone a function if a block address within that
>>> - // function is never referenced outside of the function. Given that, we
>>> - // want to map block addresses from the old function to block addresses in
>>> - // the clone. (This is different from the generic ValueMapper
>>> - // implementation, which generates an invalid blockaddress when
>>> - // cloning a function.)
>>> - //
>>> - // Note that we don't need to fix the mapping for unreachable blocks;
>>> - // the default mapping there is safe.
>>> - if (BB->hasAddressTaken()) {
>>> - Constant *OldBBAddr = BlockAddress::get(const_cast<Function*>(OldFunc),
>>> - const_cast<BasicBlock*>(BB));
>>> - VMap[OldBBAddr] = BlockAddress::get(NewFunc, NewBB);
>>> - }
>>> -
>>> + // Nope, clone it now.
>>>
>>> bool hasCalls = false, hasDynamicAllocas = false, hasStaticAllocas = false;
>>>
>>> @@ -425,7 +420,7 @@ void llvm::CloneAndPruneFunctionInto(Fun
>>> const DataLayout *DL,
>>> Instruction *TheCall) {
>>> assert(NameSuffix && "NameSuffix cannot be null!");
>>> -
>>> +
>>> #ifndef NDEBUG
>>> for (Function::const_arg_iterator II = OldFunc->arg_begin(),
>>> E = OldFunc->arg_end(); II != E; ++II)
>>> @@ -435,15 +430,91 @@ void llvm::CloneAndPruneFunctionInto(Fun
>>> PruningFunctionCloner PFC(NewFunc, OldFunc, VMap, ModuleLevelChanges,
>>> NameSuffix, CodeInfo, DL);
>>>
>>> - // Clone the entry block, and anything recursively reachable from it.
>>> + // Since all BB address references need to be known before block-by-block
>>> + // processing, we need to create all reachable blocks before processing
>>> + // them for instruction cloning and pruning. Some of these blocks may
>>> + // be removed due to later pruning.
>>> std::vector<const BasicBlock*> CloneWorklist;
>>> + //
>>> + // OrigBBs consists of all blocks reachable from the entry
>>> + // block.
>>> + // This list will be pruned down by the CloneFunction() due to two
>>> + // two optimizations:
>>> + // First, when a conditional branch target is known at compile-time,
>>> + // only the actual branch destination block needs to be cloned.
>>> + // Second, when a switch statement target is known at compile-time,
>>> + // only the actual case statement needs to be cloned.
>>> + std::set<const BasicBlock *> OrigBBs;
>>> +
>>> + CloneWorklist.push_back(&OldFunc->getEntryBlock());
>>> + while (!CloneWorklist.empty()) {
>>> + const BasicBlock *BB = CloneWorklist.back();
>>> + CloneWorklist.pop_back();
>>> +
>>> + // Don't revisit blocks.
>>> + if (VMap.count(BB))
>>> + continue;
>>> +
>>> + BasicBlock *NewBB = BasicBlock::Create(BB->getContext());
>>> + if (BB->hasName())
>>> + NewBB->setName(BB->getName() + NameSuffix);
>>> +
>>> + // It is legal to clone a function when a block address within that
>>> + // function is never escapes outside of the function. Given that, we
>>> + // want to map block addresses from the old function to block addresses in
>>> + // the clone. (This is different from the generic ValueMapper
>>> + // implementation, which generates an invalid block address when
>>> + // cloning a function.)
>>> + // Note the current escape address does not catch all legal cases: even
>>> + // when all block addresses taken are local and the function has the
>>> + // always_inline attribute due to the indirect branch inlining is
>>> + // suppressed.
>>> + // Note that we don't need to fix the mapping for unreachable blocks;
>>> + // the default mapping there is safe.
>>> + if (BB->hasAddressTaken()) {
>>> + Constant *OldBBAddr = BlockAddress::get(const_cast<Function*>(OldFunc),
>>> + const_cast<BasicBlock*>(BB));
>>> + VMap[OldBBAddr] = BlockAddress::get(NewFunc, NewBB);
>>> + }
>>> +
>>> + OrigBBs.insert(BB);
>>> + VMap[BB] = NewBB;
>>> + // Iterate over all possible successors and add them to the CloneWorklist.
>>> + const TerminatorInst *Term = BB->getTerminator();
>>> + for (unsigned i = 0, e = Term->getNumSuccessors(); i != e; ++i) {
>>> + BasicBlock *Succ = Term->getSuccessor(i);
>>> + CloneWorklist.push_back(Succ);
>>> + }
>>> + }
>>> +
>>> + // Now, fill only the reachable blocks with the cloned contents
>>> + // of the originals.
>>> + assert(CloneWorklist.empty() && "Dirty worklist before re-use\n");
>>> CloneWorklist.push_back(&OldFunc->getEntryBlock());
>>> while (!CloneWorklist.empty()) {
>>> const BasicBlock *BB = CloneWorklist.back();
>>> CloneWorklist.pop_back();
>>> - PFC.CloneBlock(BB, CloneWorklist);
>>> + PFC.CloneBlock(BB, cast<BasicBlock>(VMap[BB]), CloneWorklist,
>>> + OrigBBs);
>>> }
>>> -
>>> +
>>> + // FIXME: Delete BB's that were created but have been pruned.
>>> + // Actual cloning may have found pruning opportunities since
>>> + // branch or switch statement target may have been known at compile-time.
>>> + // Alternatively we could write a routine CloneFunction and add a) a
>>> + // parameter to actually do the cloning and b) a return parameter that
>>> + // gives a list of blocks that need to be cloned also. Then we could
>>> + // call CloneFunction when we collect the blocks to clone, but suppress
>>> + // cloning. And then actually *do* the cloning in the while loop above. Then
>>> + // the cleanup here would become redundant, and so would be the OrigBBs.
>>> + for (std::set<const BasicBlock *>::iterator Oi = OrigBBs.begin(),
>>> + Oe = OrigBBs.end(); Oi != Oe; ++Oi) {
>>> + const BasicBlock *Orig = *Oi;
>>> + BasicBlock *NewBB = cast<BasicBlock>(VMap[Orig]);
>>> + delete NewBB;
>>> + VMap[Orig] = 0;
>>> + }
>>> +
>>> // Loop over all of the basic blocks in the old function. If the block was
>>> // reachable, we have cloned it and the old block is now in the value map:
>>> // insert it into the new function in the right order. If not, ignore it.
>>> @@ -454,7 +525,8 @@ void llvm::CloneAndPruneFunctionInto(Fun
>>> BI != BE; ++BI) {
>>> Value *V = VMap[BI];
>>> BasicBlock *NewBB = cast_or_null<BasicBlock>(V);
>>> - if (!NewBB) continue; // Dead block.
>>> + if (!NewBB)
>>> + continue; // Dead block.
>>>
>>> // Add the new block to the new function.
>>> NewFunc->getBasicBlockList().push_back(NewBB);
>>>
>>>
>>> _______________________________________________
>>> llvm-commits mailing list
>>> llvm-commits at cs.uiuc.edu
>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140501/4764efc0/attachment.html>
More information about the llvm-commits
mailing list