[cfe-dev] [RFC] AlwaysInline codegen

Evgenii Stepanov via cfe-dev cfe-dev at lists.llvm.org
Fri Aug 28 10:55:54 PDT 2015


Are there any concerns remaining?
I'd like to go ahead with http://reviews.llvm.org/D12087

On Fri, Aug 21, 2015 at 3:40 PM, Richard Smith <richard at metafoo.co.uk> wrote:
> On Fri, Aug 21, 2015 at 1:23 PM, John McCall <rjmccall at apple.com> wrote:
>>
>> On Aug 20, 2015, at 7:36 PM, Richard Smith <richard at metafoo.co.uk> wrote:
>>
>> On Thu, Aug 20, 2015 at 7:17 PM, John McCall <rjmccall at apple.com> wrote:
>>>
>>> > On Aug 20, 2015, at 5:19 PM, Evgenii Stepanov via cfe-dev
>>> > <cfe-dev at lists.llvm.org> wrote:
>>> > Hi,
>>> >
>>> > There is a problem with the handling of alwaysinline functions in
>>> > Clang: they are not always inlined. AFAIK, this can only happen when
>>> > the caller is in dead code, but we don't always successfully remove
>>> > all dead code.
>>> >
>>> > Because of this, we may end up emitting an undefined reference for an
>>> > "inline __attribute__((always_inline))" function. Libc++ relies on the
>>> > compiler never doing that - it has lots of functions in the headers
>>> > marked this way and does _not_ export them from libc++.so.
>>> >
>>> > The current implementation in Clang emits alwaysinline+inline functions
>>> > as available_externally definitions. The inliner is an SCC pass, and as
>>> > such it does not process unreachable functions at all. This means that
>>> > AlwaysInliner may leave some alwaysinline functions not inlined. If
>>> > such a function has available_externally linkage, it is not emitted
>>> > into the binary, and all calls to it are emitted as undefined symbol
>>> > references.
>>> >
>>> > Some time ago I made an attempt to add a DCE pass before the
>>> > AlwaysInliner pass to fix this. That
>>> > (a) caused a big churn in the existing tests
>>> > (b) must be done at -O0 as well, which is probably undesirable and
>>> > could inflate compilation time
>>> > (c) feels like patching over a bigger problem.
>>> >
>>> > The following, better, idea was suggested by Chandler Carruth and
>>> > Richard Smith.
>>> >
>>> > Instead of emitting an available_externally definition for an
>>> > alwaysinline function, we emit a pair of
>>> > 1. internal alwaysinline definition (let's call it F.inlinefunction -
>>> > it demangles nicely)
>>> > 2a. A stub F() { musttail call F.inlinefunction }
>>> >  -- or --
>>> > 2b. A declaration of F.
>>>
>>> I have no idea why always_inline function definitions are being marked as
>>> available_externally.  There is zero reason to think that they’re actually
>>> available externally, and there’s zero benefit to the optimizer from this
>>> information because inlining is forced anyway, so this lie is both terrible
>>> and pointless.
>>
>>
>> The interesting case is a declaration like this:
>>
>>   extern inline __attribute__((gnu_inline, always_inline)) void f() { /*...*/ }
>>
>> These are common in the glibc headers, and they mean:
>>
>> 1) This is a body just for inlining, and there is a strong definition
>> provided elsewhere (gnu_inline + extern inline), probably with an entirely
>> different body.
>> 2) If you see a direct call to this function, you must inline it.
>>
>> The intention appears to be to create a function definition that is only
>> used for inlining, and is never itself emitted. Taking the address of the
>> function should give the address of the strong definition from elsewhere,
>> not the address of this local definition of f.
>>
>> So, if we only want a single IR-level function for this case, it must be
>> available_externally (for (1)), and must be always_inline (for (2)). But
>> that results in the problem that Evgeniy reported.
>>
>>
>> How so?  You have a call that doesn’t get inlined for whatever reason, and
>> it resolves against an external symbol that’s guaranteed to exist.
>> Sub-optimal, but easily fixable by DCE'ing the dead call.  I don’t see how
>> this fails to link unless, what, the program is using the libc headers and
>> not linking against libc?  Do we really need to fall over ourselves to
>> support that?
>>
>> Evgeniy seemed to be describing a situation where the external symbol
>> *wasn’t* guaranteed to exist; but in that case, gnu_inline is a clear lie.
>
>
> The combination of flags above implies that if all calls are direct calls,
> then no external definition is required. Some headers rely on that, such as
> clang's own <htmxlintrin.h>.
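
For illustration, here is a minimal sketch of the pattern described above, assuming a GNU-compatible C compiler (GCC or Clang). The names add_one, call_add_one_directly and run_demo are hypothetical and are not taken from glibc or any real header. No out-of-line definition of add_one exists anywhere in this program, so it should link only because the one direct call is inlined. Taking the function's address, or leaving a call uninlined, would be expected to produce an undefined reference instead.

```c
#include <assert.h>

/* Hypothetical example function; not taken from any real header.
 * extern inline + gnu_inline: this body exists only for inlining, and
 * this translation unit never emits an out-of-line symbol for it.
 * always_inline: every direct call is required to be inlined. */
extern inline __attribute__((gnu_inline, always_inline)) int add_one(int x) {
    return x + 1;
}

/* A direct call in live code. Once it is inlined, no reference to the
 * symbol add_one remains, so no external definition is needed. */
int call_add_one_directly(int x) {
    return add_one(x);
}

/* Small self-check for the sketch above. */
int run_demo(void) {
    assert(call_add_one_directly(41) == 42);
    return 0;
}
```

The sketch covers only the case where every call is direct and reachable. The failure discussed in this thread concerns calls left in dead code that the inliner never visits.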