patch for functions with indirect calls and always_inline attribute

Eric Christopher echristo at gmail.com
Wed Apr 16 10:11:25 PDT 2014


Knew I was forgetting something. And yeah maybe we can use Quentin's new
diagnostics to warn on this also since we could be doing cross module
inlining and the front end would have no idea.
On Apr 16, 2014 2:21 AM, "Nick Lewycky" <nicholas at mxc.ca> wrote:

> Hal Finkel wrote:
>
>> ----- Original Message -----
>>
>>> From: "Eric Christopher"<echristo at gmail.com>
>>> To: "Gerolf Hoflehner"<ghoflehner at apple.com>
>>> Cc: "llvm-commits"<llvm-commits at cs.uiuc.edu>
>>> Sent: Monday, April 14, 2014 6:11:06 PM
>>> Subject: Re: patch for functions with indirect calls and always_inline
>>>  attribute
>>>
>>> On Mon, Apr 14, 2014 at 4:09 PM, Gerolf Hoflehner
>>> <ghoflehner at apple.com>  wrote:
>>>
>>>> Please see below.
>>>>
>>>> On Apr 14, 2014, at 2:56 PM, Eric Christopher<echristo at gmail.com>
>>>> wrote:
>>>>
>>>>  I could be down for this...
>>>>>
>>>>> Do you have any numbers?
>>>>>
>>>> No. I ran into this when creating a test case for another bug fix.
>>>>
>>>
>>> OK, would you mind running some performance numbers on this? It
>>> passes
>>> my sniff test as well, but I'd like to not be wrong :)
>>>
>>> If the numbers come out neutral/ahead/no change feel free to commit.
>>>
>>
>> And if they don't? always_inline means *always inline* -- this is a
>> correctness issue, and we should do it regardless of performance
>> considerations.
>>
>> I am curious, however, why we have this restriction at all. Indirect
>> dispatch is expensive, compounding that expense with the call overhead of a
>> small function seems suboptimal; plus, for hardware that has indirect
>> branch prediction (Intel, ARM, and others), inlining should give the branch
>> predictor a better chance at doing the right thing.
>>
>
> I thought this restriction was there for correctness? Does the function
> cloning actually update blockaddress to point to the new target function?
> Even if the blockaddress is hidden inside an arbitrarily complex constant
> expression instead of being a direct operand of an instruction? (Possible,
> but you may want to visit all users of the BBs instead.)
>
> Also, always_inline is defeated when applied to varargs functions, because
> the inliner can't inline those either.
>
> Nick
>
>  Thanks!
>>>
>>> -eric
>>>
>>>
>>>>> Nits on the patch:
>>>>>
>>>>>      // Disallow inlining of functions which contain an indirect
>>>>>      branch.
>>>>> -    if (isa<IndirectBrInst>(BI->getTerminator()))
>>>>> +    if (isa<IndirectBrInst>(BI->getTerminator())&&
>>>>> +        !F.hasFnAttribute(Attribute::AlwaysInline))
>>>>>
>>>>> Update the comment.
>>>>>
>>>> Ok.
>>>>
>>>>> I'm assuming this can't be hoisted because of the
>>>>> returns_twice aspect yes?
>>>>>
>>>>
>>>> Yes, for now I only wanted to address indirect branches and leave
>>>> remaining checks as they are.
>>>>
>>>>
>>>>> -define i32 @inner5(i8* %addr) alwaysinline {
>>>>> +define i32 @inner5(i8* %addr) {
>>>>>
>>>>> Why the change?
>>>>>
>>>> This is to keep the old test behavior which relied on *not*
>>>> inlining inner5 because of an indirect branch.
>>>>
>>>>>
>>>>> -eric
>>>>>
>>>>> On Mon, Apr 14, 2014 at 2:49 PM, Gerolf Hoflehner
>>>>> <ghoflehner at apple.com>  wrote:
>>>>>
>>>>>> Hi
>>>>>>
>>>>>> llvm currently does not inline functions with indirect branches.
>>>>>> We had some
>>>>>> internal discussion and think the always_inline attribute should
>>>>>> overrule
>>>>>> that restriction.
>>>>>>
>>>>>> Gerolf
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> Index: lib/Analysis/IPA/InlineCost.cpp
>>>>>> ===================================================================
>>>>>> --- lib/Analysis/IPA/InlineCost.cpp     (revision 206204)
>>>>>> +++ lib/Analysis/IPA/InlineCost.cpp     (working copy)
>>>>>> @@ -1301,7 +1301,8 @@
>>>>>>                                     Attribute::ReturnsTwice);
>>>>>>    for (Function::iterator BI = F.begin(), BE = F.end(); BI != BE;
>>>>>>    ++BI) {
>>>>>>      // Disallow inlining of functions which contain an indirect
>>>>>>      branch.
>>>>>> -    if (isa<IndirectBrInst>(BI->getTerminator()))
>>>>>> +    if (isa<IndirectBrInst>(BI->getTerminator())&&
>>>>>> +        !F.hasFnAttribute(Attribute::AlwaysInline))
>>>>>>        return false;
>>>>>>
>>>>>>
>>>>>>
>>>>>>      for (BasicBlock::iterator II = BI->begin(), IE = BI->end();
>>>>>>      II != IE;
>>>>>> Index: test/Transforms/Inline/always-inline-attribute.ll
>>>>>> ===================================================================
>>>>>> --- test/Transforms/Inline/always-inline-attribute.ll   (revision
>>>>>> 0)
>>>>>> +++ test/Transforms/Inline/always-inline-attribute.ll   (working
>>>>>> copy)
>>>>>> @@ -0,0 +1,26 @@
>>>>>> +; RUN: opt<  %s -O3 -S | FileCheck %s
>>>>>> + at gv = external global i32
>>>>>> +
>>>>>> +define i32 @main() nounwind {
>>>>>> +; CHECK-NOT: call i32 @foo
>>>>>> +  %1 = load i32* @gv, align 4
>>>>>> +  %2 = tail call i32 @foo(i32 %1)
>>>>>> +  unreachable
>>>>>> +}
>>>>>> +
>>>>>> +define internal i32 @foo(i32) alwaysinline {
>>>>>> +  br label %2
>>>>>> +
>>>>>> +;<label>:2                                       ; preds = %8,
>>>>>> %1
>>>>>> +  %3 = phi i32 [ %0, %1 ], [ %10, %8 ]
>>>>>> +  %4 = phi i8* [ blockaddress(@foo, %2), %1 ], [ %6, %8 ]
>>>>>> +  %5 = icmp eq i32 %3, 1
>>>>>> +  %6 = select i1 %5, i8* blockaddress(@foo, %8), i8* %4
>>>>>> +  %7 = add nsw i32 %3, -1
>>>>>> +  br label %8
>>>>>> +
>>>>>> +;<label>:8                                       ; preds = %8,
>>>>>> %2
>>>>>> +  %9 = phi i32 [ %7, %2 ], [ %10, %8 ]
>>>>>> +  %10 = add nsw i32 %9, -1
>>>>>> +  indirectbr i8* %6, [label %2, label %8]
>>>>>> +}
>>>>>> Index: test/Transforms/Inline/always-inline.ll
>>>>>> ===================================================================
>>>>>> --- test/Transforms/Inline/always-inline.ll     (revision 206203)
>>>>>> +++ test/Transforms/Inline/always-inline.ll     (working copy)
>>>>>> @@ -78,7 +78,7 @@
>>>>>>    ret i32 %add
>>>>>> }
>>>>>>
>>>>>>
>>>>>>
>>>>>> -define i32 @inner5(i8* %addr) alwaysinline {
>>>>>> +define i32 @inner5(i8* %addr) {
>>>>>> entry:
>>>>>>    indirectbr i8* %addr, [ label %one, label %two ]
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> llvm-commits mailing list
>>>>>> llvm-commits at cs.uiuc.edu
>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>>>>>
>>>>>>
>>>>  _______________________________________________
>>> llvm-commits mailing list
>>> llvm-commits at cs.uiuc.edu
>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>>
>>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140416/8d005bea/attachment.html>


More information about the llvm-commits mailing list