[llvm-dev] [cfe-dev] [RFC] AlwaysInline codegen

Richard Smith via llvm-dev llvm-dev at lists.llvm.org
Fri Aug 21 15:40:31 PDT 2015

On Fri, Aug 21, 2015 at 1:23 PM, John McCall <rjmccall at apple.com> wrote:

> On Aug 20, 2015, at 7:36 PM, Richard Smith <richard at metafoo.co.uk> wrote:
> On Thu, Aug 20, 2015 at 7:17 PM, John McCall <rjmccall at apple.com> wrote:
>> > On Aug 20, 2015, at 5:19 PM, Evgenii Stepanov via cfe-dev <
>> cfe-dev at lists.llvm.org> wrote:
>> > Hi,
>> >
>> > There is a problem with the handling of alwaysinline functions in
>> > Clang: they are not always inlined. AFAIK, this can only happen when
>> > the caller is in dead code, but we don't always successfully remove
>> > all dead code.
>> >
>> > Because of this, we may end up emitting an undefined reference for an
>> > "inline __attribute__((always_inline))" function. Libc++ relies on the
>> > compiler never doing that - it has lots of functions in the headers
>> > marked this way and does _not_ export them from libc++.so.
>> >
>> > Current implementation in clang emits alwaysinline+inline functions as
>> > available_externally definitions. The inliner is an SCC pass, and as
>> > such it does not process unreachable functions at all. This means that
>> > AlwaysInliner may leave some alwaysinline functions not inlined. If
>> > such a function has available_externally linkage, it is not emitted
>> > into the binary, and all calls to it are emitted as undefined symbol
>> > references.
>> >
>> > Some time ago I made an attempt to add a DCE pass before the
>> > AlwaysInliner pass to fix this. That
>> > (a) caused a big churn in the existing tests
>> > (b) must be done at -O0 as well, which is probably undesirable and
>> > could inflate compilation time
>> > (c) feels like patching over a bigger problem.
>> >
>> > The following, better, idea was suggested by Chandler Carruth and
>> Richard Smith.
>> >
>> > Instead of emitting an available_externally definition for an
>> > alwaysinline function, we emit a pair of
>> > 1. internal alwaysinline definition (let's call it F.inlinefunction -
>> > it demangles nicely)
>> > 2a. A stub F() { musttail call F.inlinefunction }
>> >  -- or --
>> > 2b. A declaration of F.
>> I have no idea why always_inline function definitions are being marked as
>> available_externally.  There is zero reason to think that they’re actually
>> available externally, and there’s zero benefit to the optimizer from this
>> information because inlining is forced anyway, so this lie is both
>> terrible and pointless.
> The interesting case is a declaration like this:
>   extern inline __attribute__((gnu_inline, always_inline)) void f() { /*...*/ }
> These are common in the glibc headers, and they mean:
> 1) This is a body just for inlining, and there is a strong definition
> provided elsewhere (gnu_inline + extern inline), probably with an entirely
> different body.
> 2) If you see a direct call to this function, you must inline it.
> The intention appears to be to create a function definition that is only
> used for inlining, and is never itself emitted. Taking the address of the
> function should give the address of the strong definition from elsewhere,
> not the address of this local definition of f.
> So, if we only want a single IR-level function for this case, it must be
> available_externally (for (1)), and must be always_inline (for (2)). But
> that results in the problem that Evgeniy reported.
>
> How so?  You have a call that doesn’t get inlined for whatever reason, and
> it resolves against an external symbol that’s guaranteed to exist.
> Sub-optimal, but easily fixable by DCE'ing the dead call.  I don’t see how
> this fails to link unless, what, the program is using the libc headers and
> not linking against libc?  Do we really need to fall over ourselves to
> support that?
> Evgeniy seemed to be describing a situation where the external symbol
> *wasn’t* guaranteed to exist; but in that case, gnu_inline is a clear lie.

The combination of flags above implies that if all calls are direct calls,
then no external definition is required. Some headers rely on that, such as
clang's own <htmxlintrin.h>.
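
To make that concrete, here is a minimal sketch of the pattern. The names
twice, call_directly and self_check are made up for illustration and are not
taken from any real header. It assumes a GCC-compatible compiler that accepts
both attributes. Every use of twice is a direct call, so the body is inlined
and the translation unit should link without anyone providing a strong
definition of twice.

```c
#include <assert.h>

/* Hypothetical header-style helper: with gnu_inline + extern inline, this
 * body exists only for inlining and is never emitted as a symbol here. */
extern inline __attribute__((gnu_inline, always_inline))
int twice(int x) { return 2 * x; }

/* A direct call: always_inline requires the compiler to inline it, so no
 * reference to an external 'twice' symbol is needed for this function. */
int call_directly(int x) {
    return twice(x);
}

/* Small sanity check on the inlined result. */
void self_check(void) {
    assert(call_directly(21) == 42);
}
```

If the code instead took the address of twice (say, int (*p)(int) = twice;),
that address would have to come from a strong definition elsewhere, and the
link would fail without one.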