[llvm-dev] [DWARF] using simplified template names

David Blaikie via llvm-dev llvm-dev at lists.llvm.org
Wed Jun 23 15:06:25 PDT 2021

On Wed, Jun 23, 2021 at 1:14 PM <paul.robinson at sony.com> wrote:
> >> Oh, is there any consequence for deduplication in LTO?  Isn’t that name-based?
> > Should be OK - that's based on the fully mangled/linkage name of the type, which would be untouched by this.
> I’ve recently been reminded that type-unit signatures are hashes of the name, not using the standard-recommended algorithm of hashing the content; I tried to work out which name is actually used, but it’s buried deeper than I am comfortable excavating.  Can we make sure that hash is using either the name-with-parameters, or the linkage name, as the input string?  We don’t want “foo<int>” and “foo<float>” using the same type-unit signature!

Worth checking, but yeah, not a problem. We don't emit linkage names
for classes, so the only reasons we carry a linkage name on types are
ODR deduplication during LTO linking and type unit signatures when
type units are enabled. That name is stored in the DICompositeType's
"identifier" field, which isn't guaranteed to be a linkage name and
isn't used for DW_AT_linkage_name, etc. It's only used as a unique
identifier, and it won't be touched by this change.
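
A minimal sketch of why the signature stays distinct, assuming the
signature is derived by hashing the "identifier" string rather than the
pretty name. The helper name, the choice of digest bytes, and the byte
order below are illustrative assumptions, not LLVM's actual code; the
identifiers are the Itanium typeinfo-name manglings for foo<int> and
foo<float>.

```python
import hashlib


def type_unit_signature(identifier: str) -> int:
    """Hypothetical helper: derive a 64-bit signature from a type identifier."""
    digest = hashlib.md5(identifier.encode("utf-8")).digest()
    # Keep 8 of the 16 digest bytes. Which half is kept and the byte
    # order are assumptions made for this sketch.
    return int.from_bytes(digest[:8], "little")


# With simplified template names both types would share the pretty name
# "foo", but the identifier still encodes the template argument.
foo_int = {"name": "foo", "identifier": "_ZTS3fooIiE"}
foo_float = {"name": "foo", "identifier": "_ZTS3fooIfE"}

sig_int = type_unit_signature(foo_int["identifier"])
sig_float = type_unit_signature(foo_float["identifier"])
```

Because the hash input is the identifier, sig_int and sig_float differ
even though both "name" fields are the same.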

As an aside: I do have another direction I'm interested in pursuing
that's related to linkage names, rather than the pretty names: We
could reduce the number of DW_AT_linkage_names we emit by
reconstituting linkage names in symbolizers instead (e.g. if we see a
function called "f3" with a single "int" formal parameter and void
return type, we can reconstruct the linkage name for that function as
_Z2f3i; the Itanium mangling doesn't encode the return type of a
non-template function).
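
A rough sketch of that reconstruction for the easy case only: a
non-member, non-template function in the global namespace whose
parameters are builtin types passed by value. The function name and the
type table are hypothetical; a real symbolizer would read the name and
parameter types from the DWARF and would also need namespaces,
pointers, references, qualifiers, substitutions and templates. Under
the Itanium C++ ABI, "void f3(int)" mangles to _Z2f3i.

```python
# Itanium C++ ABI codes for a few builtin types (a subset, for illustration).
BUILTIN_CODES = {
    "void": "v",
    "bool": "b",
    "char": "c",
    "short": "s",
    "int": "i",
    "unsigned int": "j",
    "long": "l",
    "unsigned long": "m",
    "float": "f",
    "double": "d",
}


def reconstruct_linkage_name(name, param_types):
    """Hypothetical helper: mangle a simple global, non-template function."""
    # A source name is encoded as <length><identifier>, e.g. "2f3".
    encoded_name = f"{len(name)}{name}"
    if not param_types:
        # An empty parameter list is encoded as a single "v".
        encoded_params = "v"
    else:
        encoded_params = "".join(BUILTIN_CODES[t] for t in param_types)
    # The return type is not encoded for non-template functions.
    return "_Z" + encoded_name + encoded_params


print(reconstruct_linkage_name("f3", ["int"]))  # _Z2f3i
```

Anything outside the table raises KeyError, which is one way to signal
"keep emitting DW_AT_linkage_name for this one".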

On one particularly pathological case I'm looking at, the simplified
pretty template names are worth a 43% reduction in the final dwp's
.debug_str.dwo. A rough estimate for the linkage name change (omitting
linkage names in most cases when Clang builds the IR; certain kinds of
template cases are hard to reconstruct, but others are easy/doable
with our current DWARF) is 52%, for a combined 95% reduction in debug
string size. (For a less pathological case, one of Google's largest
binaries, it was 26%/56%, for an 82% total reduction.)
