[cfe-dev] modular codegen of class template static member variables

Richard Smith via cfe-dev cfe-dev at lists.llvm.org
Tue Nov 21 17:06:33 PST 2017


On 20 November 2017 at 15:04, David Blaikie via cfe-dev <
cfe-dev at lists.llvm.org> wrote:

> Hi Richard,
>
> (Lang, you're here because I mentioned stumbling across this on Friday in
> ORC - this is the reduced test case (where 't' is the NameMutex member and
> 'nt' is the Name member))
>
> Working on getting all LLVM binaries linking successfully under modular
> codegen, I've hit something that seems it'll need a bit more feature work
> (which I'm happy/planning to do myself - though always happy for
> help/advice/etc)...
>
> The test case I have boils down to the following modular header:
>
>   struct trivial {};
>   struct nontrivial { nontrivial(); };
>   // namespace foo {
>   void sink(void *);
>   template <typename T> struct bar {
>     static void baz() {
>       sink(&t);
>       sink(&nt);
>     }
>     static trivial t;
>     static nontrivial nt;
>   };
>   template <typename T> trivial bar<T>::t;
>   template <typename T> nontrivial bar<T>::nt;
>   //} // namespace foo
>   template struct bar<int>;
>   // inline void use() { (void)bar<int>::baz(); }
>
> To build with modular codegen:
>
>   $ echo 'module foo { header "foo.h" }' > foo.cppmap
>   $ clang++ -cc1 -xc++ -emit-module -fmodules -w -fmodule-name=foo \
>       foo.cppmap -o foo.pcm -fmodules-codegen
>
> So here are some interesting facts I know, some of which may be relevant,
> some of which may not:
>
>    1. Code as written ends up with linkonce_odr definitions for t and nt
>    2. Use use() instead of the explicit instantiation and both t and nt are
>    only declarations
>    3. Add the outer namespace foo and then t is emitted as a linkonce_odr
>    definition and nt is emitted as a declaration
>
> That last one (which was the first result I got) really confuses me - any
> ideas why a namespace would change the behavior here?
>

My first guess would be that something has registered an
ASTConsumer::HandleTopLevelDecl callback or similar, and they're assuming
that it gets called for every namespace-scope declaration.
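
To make that guess concrete, here is a toy model (not Clang code; Decl,
collect_shallow, collect_recursive, sample_tu and demo_top_level are all
hypothetical names invented for illustration). It only shows how a callback
that looks at top-level declarations alone would miss anything nested in a
namespace:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for an AST node; this is not a Clang type.
struct Decl {
  std::string name;
  bool is_namespace;
  std::vector<Decl> children;  // members, only meaningful for namespaces
};

// Models a consumer that assumes it is handed every namespace-scope
// declaration directly: it records what it sees and never descends.
inline void collect_shallow(const std::vector<Decl> &top_level,
                            std::vector<std::string> &seen) {
  for (const Decl &d : top_level)
    if (!d.is_namespace)
      seen.push_back(d.name);
}

// Models a consumer that walks into namespaces.
inline void collect_recursive(const std::vector<Decl> &decls,
                              std::vector<std::string> &seen) {
  for (const Decl &d : decls) {
    if (d.is_namespace)
      collect_recursive(d.children, seen);
    else
      seen.push_back(d.name);
  }
}

// A translation unit shaped like the test case with "namespace foo" enabled.
inline std::vector<Decl> sample_tu() {
  Decl ns{"foo", true,
          {Decl{"bar<T>::t", false, {}}, Decl{"bar<T>::nt", false, {}}}};
  return {Decl{"trivial", false, {}}, ns};
}

// Exercises both consumers on the sample translation unit.
inline void demo_top_level() {
  std::vector<std::string> shallow, deep;
  collect_shallow(sample_tu(), shallow);
  collect_recursive(sample_tu(), deep);
  assert(shallow.size() == 1);  // only "trivial"; the members of foo are missed
  assert(deep.size() == 3);     // "trivial", "bar<T>::t", "bar<T>::nt"
}
```

Whether this is what actually happens in the modular codegen path is still
only a guess; the sketch just shows the shape of the suspected bug.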


> In any case, all those mysteries/differences in behavior might be tangential
> to actually fixing the behavior here, which is what this email is really about.
>
> This is basically the same problem as inline variables, and maybe even
> would allow some support for static variables in headers too (not sure,
> will see).
>
> Any ideas what the behavior should be here? Since there's a desire not to
> run all global initializers if their specific submodule header isn't
> included in the program (for iostream's sake), how would this be done
> correctly under modular codegen?
> My initial thought is potentially to defer the global initializers to the
> includers (that seems necessary to get the lazy/only-those-included
> behavior, right?) But that may not account for indirect inclusion? I guess
> that's already handled somehow for the iostreams non-modular case, so maybe
> it works.
>

The explicit instantiation definition case is not especially interesting,
because by [temp.spec]/5.1, such things should never appear in modular
headers (because the header could only ever be used in one translation
unit).

So let's focus in on the "inline void use()" case. We need some kind of
mental model for what modular codegen means in order to figure out what
should happen. The way I'm thinking about modular codegen is roughly:

For each header for which we perform modular codegen, we act as if
 * that header is a separate translation unit in the program (in *addition*
to being included into other places), and
 * for that translation unit, we happen to emit definitions of inline
functions and class metadata, even if they're not otherwise used, and
 * in other translation units, we don't need to emit those symbols as a
consequence.

Under that model, emitting the definition of use() should cause us to emit
linkonce_odr definitions of both t and nt into the modular codegen
translation unit. But it should not suppress the emission of linkonce_odr
definitions of t and nt in other translation units too.
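
For reference, here is a sketch of the "inline void use()" variant of the
header, with some instrumentation added so the effects are observable when it
is compiled as an ordinary translation unit. The counters
(nontrivial_ctor_runs, sink_calls) and the inline definition of sink are my
additions for illustration and are not part of the original test case:

```cpp
#include <cassert>

struct trivial {};

// Hypothetical instrumentation: count constructor runs so that the dynamic
// initialization of nt is observable.
inline int &nontrivial_ctor_runs() { static int n = 0; return n; }
struct nontrivial {
  nontrivial() { ++nontrivial_ctor_runs(); }
};

// Hypothetical definition of sink; the original header only declares it.
inline int &sink_calls() { static int n = 0; return n; }
inline void sink(void *p) {
  assert(p != nullptr);
  ++sink_calls();
}

template <typename T> struct bar {
  static void baz() {
    sink(&t);   // odr-uses t
    sink(&nt);  // odr-uses nt, which needs a dynamic initializer
  }
  static trivial t;
  static nontrivial nt;
};
template <typename T> trivial bar<T>::t;
template <typename T> nontrivial bar<T>::nt;

// No explicit instantiation: bar<int>::baz, t and nt are implicitly
// instantiated because use() refers to bar<int>::baz().
inline void use() { bar<int>::baz(); }
```

In an ordinary (non-modular) build, a translation unit that calls use() gets
its own definitions of bar<int>::t and bar<int>::nt, and the initializer for
nt runs once. Under the model above, the modular codegen translation unit
would carry the same definitions in addition to, not instead of, the ones in
the includers.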

However, we also want to *not* run initializers for modules that are not
actually used (eg, we don't want linking against the standard library to
run the iostreams initializer -- and thus link in the iostreams library --
if it's not used, such as for a freestanding / embedded compilation). For
modular codegen, this presumably also needs function sections, and section
GC enabled in the linker.

> & then the modular object file would perhaps have the weak_odr definition
> of the global variable itself, but no global initializer - depending on any
> live codepaths that reference the global necessarily requiring the using TU
> to have caused the initializer to run? That seems vaguely concerning...
>

It does. Mostly I think it works out: if another TU is relying on an inline
function definition to be provided by a modular codegen object, they must
have run the notional initializer for that module, which in turn would have
initialized those globals.

There's one case I'm concerned by: suppose the module is never actually
imported, and all the TUs actually include the header textually. Now,
suppose the inline function and global variables from the modular codegen
TU are selected at link time, and the other copies are all discarded, and
we cleverly put the per-variable global initializers in a COMDAT with the
variables, so they get discarded too. Now we're left with a reference to an
uninitialized global.

Perhaps we need to make a distinction between internal linkage globals with
dynamic initialization in headers (eg the iostreams initializer), which we
run in every user of the modular codegen header, and external linkage
globals with dynamic initialization in headers (eg, inline variables,
static data members of class templates, ...) which we run as part of
initializing the modular codegen translation unit itself. If we do make
that distinction, I worry that we'll lose some of the initialization order
guarantees, though.
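
To illustrate the two kinds of global I mean, here is a small header-style
sketch. All names in it (Init, init_runs, per_includer_init, holder,
touch_shared, demo_single_tu) are hypothetical and chosen only for this
example; the iostreams initializer is merely the motivating case for the
first kind:

```cpp
#include <cassert>

// Hypothetical instrumentation: counts how many Init objects were constructed.
inline int &init_runs() { static int n = 0; return n; }
struct Init {
  Init() { ++init_runs(); }
};

// Internal linkage with dynamic initialization: every translation unit that
// includes this header gets its own copy and runs its own initializer.
static Init per_includer_init;

// External linkage with dynamic initialization: a static data member of a
// class template is one entity program-wide, initialized once.
template <typename T> struct holder {
  static Init shared;
};
template <typename T> Init holder<T>::shared;

// Refers to holder<int>::shared, which causes it to be instantiated.
inline Init &touch_shared() { return holder<int>::shared; }

// With a single includer, each of the two objects is constructed exactly once.
inline void demo_single_tu() {
  touch_shared();
  assert(init_runs() == 2);
}
```

With N translation units including this header, per_includer_init would be
constructed N times while holder<int>::shared would still be constructed
once. The proposal above would keep the first kind in every user of the
modular codegen header and move the second kind into the modular codegen
translation unit.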

> Is this making sense? Any good ideas? Pointers to where to start, etc?
>
> Thanks,
> - Dave