[Patch] Use comdats for linkonce_odr constructors/destructors

Rafael EspĂ­ndola rafael.espindola at gmail.com
Sun Oct 19 13:23:46 PDT 2014


>> The attached patch makes clang use comdats for linkonce_odr structors.
>> This matches the gcc behavior and there are two potential advantages
>> that I can think off
>>
>> * Avoid a bit of code bloat when mixing gcc and clang compiled code
>> since they will have the same COMDATs.
>
>
> Can you say a bit more about the cases where this reduces code bloat? Either
> way, we'll end up with exactly one definition of each Dn variant that is
> used, won't we?

We end up resolving the symbol to one of the definitions, but if gcc
and clang put them in different COMDATS we can end up with a bit of
dead code in the final binary if not using -ffunction-sections and GC
in the linker. What happens is that during initial read in the linker
will only see two different comdats (D5 from GCC, D1/D2 from clang)
and keep both. With matching comdats the linker doesn't even need to
read the duplicated entries.

>> * Avoid pain to users that developed a dependency on the gcc behavior,
>> but it might be a bit late for that
>> (http://llvm.org/devmtg/2014-04/PDFs/Talks/NickRefactoring.pdf).
>
>
> I don't think making broken code work better has a lot of value; I think
> it's better to focus our efforts on making it easier to detect that code is
> broken.

I agree. I did start working on this because of a similar problem, but
I don't see it as a strong argument for upstream.

>> I was hoping this wouldn't have a noticeable code quality impact, but
>> I have seen at least two issues
>>
>> * Putting D0 in the D5 comdat is an ABI bug IMHO since it is never
>> equal to D1 or D2 and doing so with linkonce_odr can bloat the code a
>> bit by causing us to keep more functions then we need (keep D1/D2 if
>> they get inlined into D0 for example).
>
>
> It will cause us to emit more IR, but I don't see that it will cause us to
> keep more functions; the D0 will still be discardable if it ends up unused,
> right? (It looks like GNU ld's --gc-sections will keep sections that are
> within a used COMDAT, but gold's --gc-sections will remove them.)

That is correct. All the references about bloat in my email assume a
plain ELF linker, without using GC. With GC instead of bloat we just
get a bit more work in LLVM and the linker.

>> * Using a COMDAT for the destructors of a class bar prevents D2 from
>> being replaced with D2 of the base class. That means we end up with
>>
>> define linkonce_odr void @_ZN3fooIiED2Ev(%struct.foo* %this)
>> unnamed_addr #1 comdat $_ZN3fooIiED5Ev align 2 {
>> entry:
>>   %0 = bitcast %struct.foo* %this to %struct.bar*
>>   tail call void @_ZN3barD2Ev(%struct.bar* %0)
>>   ret void
>> }
>>
>> Instead of rauw of _ZN3fooIiED2Ev with _ZN3barD2Ev. We could teach
>> LLVM to do this, but we are not there yet.
>
>
> I think you can still RAUW; you just need to leave the derived dtor behind
> if you leave any part of its comdat behind.

Yes. Clang could be made a bit cleaver to do a RAUW of _ZN3fooIiED2Ev
and _ZN3fooIiED1Ev with _ZN3barD2Ev. It would require a bit more
refactoring. I will give it a try to see if it gets too complex.

>> So it is not clear if this is worth it having it upstream. What do you
>> guys think?
>
>
> Not sure yet.


Cheers,
Rafael



More information about the cfe-commits mailing list