[PATCH] D54282: [DWARFv5] Emit normal type units in .debug_info comdats.

Fri Nov 9 10:38:50 PST 2018

dblaikie added inline comments.

================
Comment at: llvm/lib/MC/MCObjectFileInfo.cpp:822
+    return Ctx->getELFSection(
+        DwarfVersion <= 4 ? ".debug_types" : ".debug_info",
+        TT.isMIPS() ? ELF::SHT_MIPS_DWARF : ELF::SHT_PROGBITS,
----------------
probinson wrote:
> dblaikie wrote:
> > Seems odd to have a function for "DwarfTypesSection" return the debug_info section. I'd tend towards having MCObjectFileInfo be more literal than this - getDwarfTypesSection(Hash) and getDwarfInfoSection(Hash), etc.
> > 
> > (a very casual/far-off motivation, is that it'd be great if, eventually, debug_info came to allowing various different entities in comdat'd sections - not just types - this is what some of us have thrown around as "bag of DWARF")
> Excellent point, that would be a better division of responsibility.
> 
> i've had some discussions with Sony linker people about different tactics for subdividing DWARF, using comdats or other ELF tricks. Cost/benefit tradeoffs have not been clear. But worth pursuing.
Yeah, stuff Adrian, Eric, and I have thrown around goes in a few directions:

1) type units aren't super efficient (for GCC they were a 2x growth, roughly - for every byte that was removed from the CU, you got two bytes of TU - for Clang (only because it's a bit more conservative about which types to put in type units) it's about 1.5x) one major reason is that the CU can only reference the type itself - not all the other things, mostly the member function declarations would be great to reference ratehr than as it is today, where any member function definition in the CU references a declaration of that function in the stub type description in the CU, which is a lot of duplicate bytes.

-> So it'd be great if a type unit were enhanced to allow more than one entity to be referenced (have the TU header include a count N, then N x {hash, offset} pairs rather than the current header which has exactly one {hash, offset} pair)

2) Some data doesn't need to go in a separate unit - sometimes you know you're only emitting a type once, so would be cool to be able to emit it into the CU. But what about types that need to be in a type unit? I guess they could refer to the type by declaration - in which case there'd be no reason to refer back into the CU. But alternatively it could be done by having even the CU have a list of referencable entities... eh, maybe that's less useful.

But in any case, once you do (1) it's perhaps less important to refer to this thing as a type unit, but instead as a bag of referencable entities, whatever the producer wants to group up like that. (& yeah, might work welrl for inline function declarations, etc)

Aside: I actually might try to do the "types referenced from type units are referenced as declarations instead of using any signatures" because with modular debug info there's more chance you'll have types you know are only going in one place (even today we have some of those - with strong vtables you can know you won't put the type anywhere but there) so you'd want to avoid type unit overhead. But some type unit might want to refer to that type - obviously that type unit is going to be in multiple object files and only one of those object files has the first vtable-bound type, so the other copies of the type unit must be using declarations to refer to the vtable-bound type anyway, seems fine to do it in this case where the vtable-bound type is available anyway.

</ramble>

Repository:
  rL LLVM

https://reviews.llvm.org/D54282