[cfe-dev] [LLVMdev] weak_odr constant versus weak_odr global

John McCall rjmccall at apple.com
Sun Nov 27 05:00:01 PST 2011


On Nov 21, 2011, at 9:05 AM, Rafael Espíndola wrote:
>> Unfortunately, making the comdat be for the entire function is not
>> conformant with the ABI, which says that you either put the variable
>> and its guard in different comdats or you put them in a single comdat
>> named for the variable.  It also doesn't actually help unless we disable
>> inlining.
> 
> I see. Using two comdats would still cause the same problem for us,
> no? So the solution in the end is to emit:
> 
> TU1:
> --------------------------------
> @_ZN1UI1SE1kE = weak_odr constant i32 42, align 4, comdat _ZN1UI1SE1kE
> @_ZGVN1UI1SE1kE = weak_odr global i64 1, comdat _ZN1UI1SE1kE
> --------------------------------
> 
> TU2:
> -----------------------------------
> @_ZN1UI1SE1kE = weak_odr global i32 0, align 4, comdat _ZN1UI1SE1kE
> @_ZGVN1UI1SE1kE = weak_odr global i64 0, comdat _ZN1UI1SE1kE
> ...
> @llvm.global_ctors = ....
> define internal void @_GLOBAL__I_a() nounwind section ".text.startup" ....
> -----------------------------------

Exactly.

To sketch out the proposed IR extension a bit more:
1.  We add 'comdat "name"' to the global variable and function
productions.  I have the COMDAT name in quotes only because
there's no other precedent for a bare identifier in the IR grammar.
I don't think we want to allow this on aliases;  I think I could
probably invent reasonable semantics, but it's really not worth
worrying about without cause.
2.  A symbol with a COMDAT name must be a definition.
3.  All symbols sharing the same COMDAT name are required to
share the same linkage and visibility.  Conveniently, this lets us
talk about the COMDAT's linkage / etc.
4.  A symbol with a COMDAT name is considered to be referenced
if any symbol with the same COMDAT name is referenced
directly (i.e. referenced without relying on this rule).
5.  It's undefined behavior if two modules are linked and they
export different sets of symbols with a given COMDAT name.
6.  Otherwise, if two modules are linked and they both export
symbols with a given COMDAT name, all the symbols must be
taken from the same module.

I think that covers it.
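
As a rough illustration of rules 4 through 6, here is a minimal
C++ sketch that models each module as a map from COMDAT name to
the set of symbol names it exports under that name.  Everything
in it (ComdatExports, groupIsReferenced, chooseProviders) is a
hypothetical name invented for this example, not existing LLVM
API.  Rule 5 makes a mismatch undefined behavior; the sketch
simply reports it as a failure.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical model of one module: COMDAT name -> symbols exported under it.
typedef std::map<std::string, std::set<std::string> > ComdatExports;

// Rule 4: every member of a group counts as referenced as soon as any one
// member is referenced directly.
bool groupIsReferenced(const std::set<std::string> &members,
                       const std::set<std::string> &directlyReferenced) {
  for (const std::string &m : members)
    if (directlyReferenced.count(m))
      return true;
  return false;
}

// Rules 5 and 6: decide, per COMDAT name, which module (0 = first,
// 1 = second) supplies the whole group. Returns false if the two modules
// export different member sets for the same name (undefined behavior under
// rule 5; reported as an error in this sketch).
bool chooseProviders(const ComdatExports &first, const ComdatExports &second,
                     std::map<std::string, int> &provider) {
  provider.clear();
  for (const auto &entry : first) {
    assert(!entry.second.empty() && "a COMDAT name has at least one member");
    provider[entry.first] = 0;
  }
  for (const auto &entry : second) {
    assert(!entry.second.empty() && "a COMDAT name has at least one member");
    ComdatExports::const_iterator it = first.find(entry.first);
    if (it == first.end()) {
      provider[entry.first] = 1;   // only the second module defines the group
    } else if (it->second != entry.second) {
      return false;                // rule 5: member sets differ
    }
    // Otherwise both modules export the same set; the group keeps provider 0,
    // so all of its symbols come from a single module (rule 6).
  }
  return true;
}
```

Picking the first module on a tie is an arbitrary choice made for
the sketch; the rules only require that the whole group come from
one module, not which one.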

The implementation can be optimized around the following
properties of the typical use patterns in the C++ ABI.
a) Most symbols do not need COMDAT names, or at least do not
need "non-trivial" COMDAT names, i.e. COMDAT names that also
cover other symbols or that do not match the symbol's own name.
b) When symbols do need COMDAT names, we'll almost
always know exactly how many symbols are going in the group.
That number will usually be two.
c) It's frequently going to be convenient to be able to add
a COMDAT name to a GV after the GV was allocated.
d) Otherwise, symbols will probably never need to change
or remove their COMDAT name, and we probably don't even
need to add API for it.
e) Many clients are going to want to be able to efficiently test
whether a symbol is in a COMDAT group.
f) Those clients will also generally care about efficiently
iterating over all the symbols in that group.

I'd suggest having a bit on GlobalValue and a side-table
on the Module mapping from GVs to COMDAT objects,
where COMDAT objects are allocated as part of the
StringMapEntry on the Module and don't really contain
any data except their name and a list of GV*s, heavily
optimized for the two-element case.
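
A minimal sketch of that layout, using only the standard library
so that it stands alone.  The GlobalValue, Comdat and Module
classes here are simplified stand-ins invented for illustration,
not the real LLVM classes, and std::unordered_map stands in for
StringMap (so the COMDAT's name lives in the map key rather than
in the object).  Two inline slots cover the common case from (b);
a vector is touched only for larger groups.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Simplified stand-in for a global value; only the parts the sketch needs.
struct GlobalValue {
  std::string name;
  bool hasComdat = false;  // the "bit on GlobalValue": cheap membership test
  explicit GlobalValue(std::string n) : name(std::move(n)) {}
};

// A COMDAT group: a list of members with two inline slots, since groups
// are expected to hold two symbols most of the time.
class Comdat {
  GlobalValue *inlineMembers[2] = {nullptr, nullptr};
  std::vector<GlobalValue *> overflow;  // used only beyond two members
  std::size_t count = 0;

public:
  void add(GlobalValue *gv) {
    if (count < 2)
      inlineMembers[count] = gv;
    else
      overflow.push_back(gv);
    ++count;
  }
  std::size_t size() const { return count; }
  GlobalValue *member(std::size_t i) const {
    assert(i < count && "member index out of range");
    return i < 2 ? inlineMembers[i] : overflow[i - 2];
  }
};

class Module {
  // COMDAT objects live inside the map entries keyed by their name.
  // unordered_map keeps element addresses stable across rehashing, so the
  // side table below can safely store pointers into it.
  std::unordered_map<std::string, Comdat> comdats;
  // Side table from global values to their COMDAT object.
  std::unordered_map<const GlobalValue *, Comdat *> comdatOf;

public:
  // Attach a COMDAT name to an already-allocated global, as in (c).
  void setComdat(GlobalValue &gv, const std::string &name) {
    assert(!gv.hasComdat && "changing a COMDAT name is not modelled, per (d)");
    Comdat &group = comdats[name];
    group.add(&gv);
    comdatOf[&gv] = &group;
    gv.hasComdat = true;
  }
  // Returns the group for gv, or nullptr if it has none. The bit lets the
  // common no-COMDAT case skip the table lookup entirely.
  const Comdat *getComdat(const GlobalValue &gv) const {
    if (!gv.hasComdat)
      return nullptr;
    auto it = comdatOf.find(&gv);
    return it == comdatOf.end() ? nullptr : it->second;
  }
};
```

With this shape, the test in (e) is a single bit check and the
iteration in (f) is a loop over member(0) .. member(size() - 1).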

John.


