[cfe-dev] [RFC] AlwaysInline codegen

John McCall via cfe-dev cfe-dev at lists.llvm.org
Thu Aug 20 19:17:27 PDT 2015


> On Aug 20, 2015, at 5:19 PM, Evgenii Stepanov via cfe-dev <cfe-dev at lists.llvm.org> wrote:
> Hi,
> 
> There is a problem with the handling of alwaysinline functions in
> Clang: they are not always inlined. AFAIK, this can only happen when
> the caller is in dead code, but we don't always successfully
> remove all dead code.
> 
> Because of this, we may end up emitting an undefined reference for an
> "inline __attribute__((always_inline))" function. Libc++ relies on the
> compiler never doing that - it has lots of functions in the headers
> marked this way and does _not_ export them from libc++.so.
> 
> The current implementation in Clang emits alwaysinline+inline functions as
> available_externally definitions. The inliner is an SCC pass, and as
> such it does not process unreachable functions at all. This means that
> AlwaysInliner may leave some alwaysinline functions not inlined. If
> such a function has available_externally linkage, it is not emitted
> into the binary, and all calls to it are emitted as undefined symbol
> references.
> 
> Some time ago I made an attempt to add a DCE pass before the
> AlwaysInliner pass to fix this. That
> (a) caused a big churn in the existing tests
> (b) must be done at -O0 as well, which is probably undesirable and
> could inflate compilation time
> (c) feels like patching over a bigger problem.
> 
> The following, better, idea was suggested by Chandler Carruth and Richard Smith.
> 
> Instead of emitting an available_externally definition for an
> alwaysinline function, we emit a pair of
> 1. internal alwaysinline definition (let's call it F.inlinefunction -
> it demangles nicely)
> 2a. A stub F() { musttail call F.inlinefunction }
>  -- or --
> 2b. A declaration of F.

I have no idea why always_inline function definitions are being marked as
available_externally.  There is zero reason to think that they’re actually
available externally, and there’s zero benefit to the optimizer from this
information because inlining is forced anyway, so this lie is both terrible
and pointless.

I don’t understand the goal of this stub idea.  We should just emit always_inline
function definitions normally, but cap their linkage at hidden+linkonce_odr
if they are not already internal.  If the AlwaysInliner fails to inline a dead call, fine, we’ll
just emit a dead function body, too.  (I mean, this seems like inexcusable
backend behavior to me, but whatever.)
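
A rough source-level sketch of that linkage, under stated assumptions: as far
as I know, a C++ inline function with hidden visibility is normally lowered to
a "linkonce_odr hidden" definition, so a call that survives inlining still
resolves inside the same image instead of becoming an undefined reference.
The names add_one and caller are hypothetical and are not taken from libc++.

```cpp
// Hypothetical header-style helper (illustration only).
// `inline` is expected to give the definition linkonce_odr linkage in LLVM IR,
// and the visibility attribute keeps the symbol out of the dynamic symbol
// table, so nothing has to be exported from a shared library.
__attribute__((always_inline, visibility("hidden")))
inline int add_one(int x) {
    return x + 1;
}

// Hypothetical caller. If the call below were ever left un-inlined, the
// function body above would simply be emitted alongside it.
int caller(bool flag) {
    if (flag)
        return add_one(41);
    return 0;
}
```

Whether the emitted IR actually carries this linkage depends on the Clang
version and language mode, so check it with -emit-llvm rather than taking
this sketch on faith.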

John.

