patch for functions with indirect calls and always_inline attribute

Hal Finkel hfinkel at anl.gov
Wed Apr 16 04:13:00 PDT 2014


----- Original Message -----
> From: "Nick Lewycky" <nicholas at mxc.ca>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: "Eric Christopher" <echristo at gmail.com>, "llvm-commits" <llvm-commits at cs.uiuc.edu>
> Sent: Wednesday, April 16, 2014 4:25:40 AM
> Subject: Re: patch for functions with indirect calls and always_inline attribute
> 
> Hal Finkel wrote:
> > ----- Original Message -----
> >> From: "Eric Christopher"<echristo at gmail.com>
> >> To: "Gerolf Hoflehner"<ghoflehner at apple.com>
> >> Cc: "llvm-commits"<llvm-commits at cs.uiuc.edu>
> >> Sent: Monday, April 14, 2014 6:11:06 PM
> >> Subject: Re: patch for functions with indirect calls and
> >> always_inline	attribute
> >>
> >> On Mon, Apr 14, 2014 at 4:09 PM, Gerolf Hoflehner
> >> <ghoflehner at apple.com>  wrote:
> >>> Please see below.
> >>>
> >>> On Apr 14, 2014, at 2:56 PM, Eric Christopher<echristo at gmail.com>
> >>> wrote:
> >>>
> >>>> I could be down for this...
> >>>>
> >>>> Do you have any numbers?
> >>> No. I ran into this when creating a test case for another bug
> >>> fix.
> >>
> >> OK, would you mind running some performance numbers on this? It
> >> passes
> >> my sniff test as well, but I'd like to not be wrong :)
> >>
> >> If the numbers come out neutral/ahead/no change feel free to
> >> commit.
> >
> > And if they don't? always_inline means *always inline* -- this is a
> > correctness issue, and we should do it regardless of performance
> > considerations.
> >
> > I am curious, however, why we have this restriction at all.
> > Indirect dispatch is expensive, compounding that expense with the
> > call overhead of a small function seems suboptimal; plus, for
> > hardware that has indirect branch prediction (Intel, ARM, and
> > others), inlining should give the branch predictor a better chance
> > at doing the right thing.
> 
> I thought this restriction was there for correctness? Does the
> function
> cloning actually update blockaddress to point to the new target
> function? Even if the blockaddress is hidden inside an arbitrarily
> complex constant expression instead of being a direct operand of an
> instruction? (Possible, but you may want to visit all users of the
> BBs
> instead.)

Yep; a few people have mentioned this (thanks everyone!) -- we'd need to do an escape analysis on the block address.

> 
> Also, always_inline is defeated when applied to varargs functions,
> because the inliner can't inline those either.

This is something obvious in the frontend; we should have a warning about this ;)

 -Hal

> 
> Nick
> 
> >> Thanks!
> >>
> >> -eric
> >>
> >>>>
> >>>> Nits on the patch:
> >>>>
> >>>>      // Disallow inlining of functions which contain an indirect
> >>>>      branch.
> >>>> -    if (isa<IndirectBrInst>(BI->getTerminator()))
> >>>> +    if (isa<IndirectBrInst>(BI->getTerminator())&&
> >>>> +        !F.hasFnAttribute(Attribute::AlwaysInline))
> >>>>
> >>>> Update the comment.
> >>> Ok.
> >>>> I'm assuming this can't be hoisted because of the
> >>>> returns_twice aspect yes?
> >>>
> >>> Yes, for now I only wanted to address indirect branches and leave
> >>> remaining checks as they are.
> >>>
> >>>>
> >>>> -define i32 @inner5(i8* %addr) alwaysinline {
> >>>> +define i32 @inner5(i8* %addr) {
> >>>>
> >>>> Why the change?
> >>> This is to keep the old test behavior which relied on *not*
> >>> inlining inner5 because of an indirect branch.
> >>>>
> >>>> -eric
> >>>>
> >>>> On Mon, Apr 14, 2014 at 2:49 PM, Gerolf Hoflehner
> >>>> <ghoflehner at apple.com>  wrote:
> >>>>> Hi
> >>>>>
> >>>>> llvm currently does not inline functions with indirect
> >>>>> branches.
> >>>>> We had some
> >>>>> internal discussion and think the always_inline attribute
> >>>>> should
> >>>>> overrule
> >>>>> that restriction.
> >>>>>
> >>>>> Gerolf
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> Index: lib/Analysis/IPA/InlineCost.cpp
> >>>>> ===================================================================
> >>>>> --- lib/Analysis/IPA/InlineCost.cpp     (revision 206204)
> >>>>> +++ lib/Analysis/IPA/InlineCost.cpp     (working copy)
> >>>>> @@ -1301,7 +1301,8 @@
> >>>>>                                     Attribute::ReturnsTwice);
> >>>>>    for (Function::iterator BI = F.begin(), BE = F.end(); BI !=
> >>>>>    BE;
> >>>>>    ++BI) {
> >>>>>      // Disallow inlining of functions which contain an
> >>>>>      indirect
> >>>>>      branch.
> >>>>> -    if (isa<IndirectBrInst>(BI->getTerminator()))
> >>>>> +    if (isa<IndirectBrInst>(BI->getTerminator())&&
> >>>>> +        !F.hasFnAttribute(Attribute::AlwaysInline))
> >>>>>        return false;
> >>>>>
> >>>>>
> >>>>>
> >>>>>      for (BasicBlock::iterator II = BI->begin(), IE =
> >>>>>      BI->end();
> >>>>>      II != IE;
> >>>>> Index: test/Transforms/Inline/always-inline-attribute.ll
> >>>>> ===================================================================
> >>>>> --- test/Transforms/Inline/always-inline-attribute.ll
> >>>>>   (revision
> >>>>> 0)
> >>>>> +++ test/Transforms/Inline/always-inline-attribute.ll
> >>>>>   (working
> >>>>> copy)
> >>>>> @@ -0,0 +1,26 @@
> >>>>> +; RUN: opt<  %s -O3 -S | FileCheck %s
> >>>>> + at gv = external global i32
> >>>>> +
> >>>>> +define i32 @main() nounwind {
> >>>>> +; CHECK-NOT: call i32 @foo
> >>>>> +  %1 = load i32* @gv, align 4
> >>>>> +  %2 = tail call i32 @foo(i32 %1)
> >>>>> +  unreachable
> >>>>> +}
> >>>>> +
> >>>>> +define internal i32 @foo(i32) alwaysinline {
> >>>>> +  br label %2
> >>>>> +
> >>>>> +;<label>:2                                       ; preds = %8,
> >>>>> %1
> >>>>> +  %3 = phi i32 [ %0, %1 ], [ %10, %8 ]
> >>>>> +  %4 = phi i8* [ blockaddress(@foo, %2), %1 ], [ %6, %8 ]
> >>>>> +  %5 = icmp eq i32 %3, 1
> >>>>> +  %6 = select i1 %5, i8* blockaddress(@foo, %8), i8* %4
> >>>>> +  %7 = add nsw i32 %3, -1
> >>>>> +  br label %8
> >>>>> +
> >>>>> +;<label>:8                                       ; preds = %8,
> >>>>> %2
> >>>>> +  %9 = phi i32 [ %7, %2 ], [ %10, %8 ]
> >>>>> +  %10 = add nsw i32 %9, -1
> >>>>> +  indirectbr i8* %6, [label %2, label %8]
> >>>>> +}
> >>>>> Index: test/Transforms/Inline/always-inline.ll
> >>>>> ===================================================================
> >>>>> --- test/Transforms/Inline/always-inline.ll     (revision
> >>>>> 206203)
> >>>>> +++ test/Transforms/Inline/always-inline.ll     (working copy)
> >>>>> @@ -78,7 +78,7 @@
> >>>>>    ret i32 %add
> >>>>> }
> >>>>>
> >>>>>
> >>>>>
> >>>>> -define i32 @inner5(i8* %addr) alwaysinline {
> >>>>> +define i32 @inner5(i8* %addr) {
> >>>>> entry:
> >>>>>    indirectbr i8* %addr, [ label %one, label %two ]
> >>>>>
> >>>>>
> >>>>> _______________________________________________
> >>>>> llvm-commits mailing list
> >>>>> llvm-commits at cs.uiuc.edu
> >>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> >>>>>
> >>>
> >> _______________________________________________
> >> llvm-commits mailing list
> >> llvm-commits at cs.uiuc.edu
> >> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> >>
> >
> 
> 

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory



More information about the llvm-commits mailing list