[LLVMdev] Inline hint for methods defined in-class

Thu Jul 9 09:39:50 PDT 2015

On Wed, Jul 8, 2015 at 10:40 PM, Chandler Carruth <chandlerc at google.com> wrote:
> On Wed, Jul 8, 2015 at 1:46 PM Hal Finkel <hfinkel at anl.gov> wrote:
>>
>> ----- Original Message -----
>> > From: "Xinliang David Li" <davidxl at google.com>
>> > To: "Chandler Carruth" <chandlerc at gmail.com>
>> > Cc: cfe-commits at cs.uiuc.edu, "<llvmdev at cs.uiuc.edu> List"
>> > <llvmdev at cs.uiuc.edu>
>> > Sent: Wednesday, July 8, 2015 12:25:18 AM
>> > Subject: Re: [LLVMdev] Inline hint for methods defined in-class
>> >
>> > On Tue, Jul 7, 2015 at 6:06 PM, Chandler Carruth
>> > <chandlerc at gmail.com> wrote:
>> > > On Tue, Jul 7, 2015 at 4:11 PM Easwaran Raman <eraman at google.com>
>> > > wrote:
>> > >>
>> > >> I'm reviving this thread after a while and CCing cfe-commits as
>> > >> suggested by David Blaikie. I've also collected numbers building
>> > >> chrome (from chromium, on Linux) with and without this patch as
>> > >> suggested by David. I've re-posted the proposed patch and
>> > >> performance/size numbers collected at the top to make it easily
>> > >> readable for those reading it through cfe-commits.
>> > >
>> > >
>> > > First off, thanks for collecting the numbers and broadening the
>> > > distribution. Also, sorry it took me so long to get over to this
>> > > thread.
>> > >
>> > > I want to lay out my stance on this issue at a theoretical and
>> > > practical
>> > > level first. I'll follow up with thoughts on the numbers as well
>> > > after that.
>> > >
>> > > I think that tying *any* optimizer behavior to the 'inline' keyword
>> > > is
>> > > fundamentally the wrong direction.
>> >
>> > Chandler, thanks for sharing your thought -- however I don't think it
>> > is wrong, let alone 'fundamentally wrong'.  Despite all the analysis
>> > that can be done, the inliner is in the end heuristic based. In lack
>> > of the profile data, when inlining two calls yield the same static
>> > benefit and size cost, it is reasonable for the inliner to think the
>> > call to the function with inline hint to yield more high
>> > dynamic/runtime benefit -- thus it has a higher static size budget to
>> > burn.
>> >
>> > >We have reasons why we have done this
>> > > historically, and we can't just do an immediate about face, but we
>> > > should be
>> > > actively looking for ways to *reduce* the optimizer's reliance on
>> > > this
>> > > keyword to convey any meaning whatsoever.
>> >
>> > yes those additional things will be done, but they are really
>> > orthogonal.
>> >
>> > >
>> > > The reason I think that is the correct direction is because, for
>> > > better or
>> > > worse, the 'inline' keyword in C++ is not about optimization, but
>> > > about
>> > > linkage.
>> >
>> > It is about both optimization and linkage. In fact the linkage simply
>> > serves as an implementation detail. In C++ standard 7.1.2,  paragraph
>> > 2 says:
>>
>> The fact that C++ combines, into one keyword, a change in semantics
>> (linkage) and an optimization hint is quite unfortunate. I wish it were
>> otherwise.
>
>
> We could work to change it? I specifically proposed adding a way to move
> away from this unfortunate design.
>
>>
>> However, as it stands, I support this change. The benchmark numbers are
>> encouraging, and it replaces an implementation quirk with the underlying
>> (unfortunate) language design choice. The implementation quirk is that
>> putting the inline keyword on an in-class function definition changes the
>> behavior of the optimizer. However, according to the language specification,
>> that definition should have implied that keyword. While an implementation is
>> certainly free to do arbitrary things with hints, this behavior violates the
>> spirit of the language specification.
>
>
> I strongly disagree that this is the spirit of the language specification.
> Even if it was historically, I think we should move away from that. The
> language shouldn't be trying to do this with a language keyword, and it
> shouldn't be coupling semantics to hints. I'm very happy to take this up
> with the committee, but I don't see why we shouldn't push Clang in that
> direction here when there is no issue of conformance.
>
> To see how broken this is, let's look at how miserably small the difference
> is between the un-hinted and hinted thresholds. We've ended up shrinking
> this difference over time in LLVM because increasing the hinted threshold
> caused lots of performance regressions and size regressions.
>
>>
>> It makes a meaningless use of a standardized keyword meaningful, and
>> that's the greater transgression.
>
>
> So here is what I want to do:
>
> 1) Add a non-semantic attribute that conveys this hint. We could even convey
> a much *stronger* hint with this rather than just a tiny hint the way it is
> today because it wouldn't end up being forced onto every template regardless
> of whether that makes sense.
>
> 2) Start lobbying to remove the hint from the 'inline' keyword by working
> with the people who see regressions from this to use the new annotation to
> recover the performance.
>
> 3) Completely remove the semantic coupling of the optimizer hint and fix the
> meaningless use of the standardized keyword at the same time.
>
> But the more places where we use the inline hint today, the harder #2 will
> become. I've already tried once before to remove the hint and couldn't
> because of benchmarks that had been tightly tuned and coupled to the
> existing (quirky) behavior. I really think that doing this more will make
> getting to #3 harder. Making progress toward a cleaner design harder seems
> worse than coping with the quirks that have existed in Clang for over 5
> years for another few years.
>
>>
>> In addition, it does tend to be the case that in-class function
>> definitions are small and suitable for inlining.
>
>
> But if they are small and suitable for inlining, won't the existing
> threshold work just fine?
>
>>
>>
>> I agree. A 1% performance increase is worth a 4% code-size increase when
>> not optimizing for size.
>
>
> I really don't. Maybe in -O3 or something, but a 4% code-size increase is a
> hard regression to swallow. Especially in a single benchmark, and where many
> other benchmarks show no benefit at all. This isn't a matter of "most code
> gets better, so we need to tolerate the unfortunate variance of size".

The numbers I presented under Google internal benchmarks is a geomean
of 20 benchmarks and many of them show benefit (and some show a
performance regression). In terms of size, only clang shows > 1% size
increase, so that's the real outlier here.

- Easwaran

> I know its not really "fair" to view regressions as more important than
> missed opportunities, but the reality is that regressions *are* more
> problematic than missed opportunities.
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>