[llvm-dev] DWARF: Reconstituting mangled names (& skipping DW_AT_linkage_name)

Reid Kleckner via llvm-dev llvm-dev at lists.llvm.org
Thu Jul 1 20:21:54 PDT 2021


It could work, but the long linkage names will still be present in .strtab,
so I wonder if it would make more sense to pursue a solution that addresses
both issues. I happen to know you were considering a separate proposal for
that, and I wonder if it could be used to solve this problem as well.
Either way, the debug info consumer must be taught to look up or
reconstitute the long mangled name.

I was thinking something like, "if a symbol name is longer than some
threshold X, replace it with _H${contenthash} and place the long name in a
side table section". Tools that are aware of the new convention can do the
lookup in the side table. Tools that are unaware will just produce funny
names. The DWARF linkage name would use the _H symbol, and consumers that
need more than a unique linkage identifier can do the lookup.
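
To make the idea concrete, here is a rough sketch in Python rather than a
design. Everything in it is an assumption for illustration: the 4096
threshold, using MD5 of the name itself as the "content hash", the helper
names shorten_symbol and resolve_symbol, and a plain dict standing in for
the side table section.

```python
import hashlib

# Assumed cutoff; the proposal only says "X threshold".
THRESHOLD = 4096


def shorten_symbol(name: str, side_table: dict, threshold: int = THRESHOLD) -> str:
    """Return `name` unchanged if it is short enough; otherwise return an
    _H<hash> alias and record the alias -> long name mapping in `side_table`."""
    if len(name) <= threshold:
        return name
    # Assumption: the content hash is MD5 over the long name's bytes.
    digest = hashlib.md5(name.encode("utf-8")).hexdigest()
    alias = "_H" + digest
    side_table[alias] = name
    return alias


def resolve_symbol(symbol: str, side_table: dict) -> str:
    """What an aware tool would do: map an alias back to the long name.
    Symbols with no side-table entry are returned as-is."""
    return side_table.get(symbol, symbol)
```

A tool that does not know the convention would simply display the _H alias,
which is still a unique identifier. An aware tool would call something like
resolve_symbol before demangling.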

There is prior art for this. MSVC caps linkage names at 4096 characters, I
believe, and hashes longer names down with MD5:
https://github.com/llvm/llvm-project/blob/main/clang/lib/AST/MicrosoftMangle.cpp#L53

On Thu, Jun 24, 2021 at 5:32 PM David Blaikie via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

> In addition to simplifying template names (
> https://groups.google.com/g/llvm-dev/c/ekLMllbLIZg ) another case I've
> found in my use case is a lot of mangled names (in part because we build
> with -fdebug-info-for-profiling which turns on function linkage names even
> at -g1/-gmlt).
>
> So I was wondering if we could recreate linkage names from DWARF, rather
> than encoding them directly - and I have a prototype that seems to show
> this is possible (at least some simple cases - including some template
> cases).
>
> In the pathological case I'm looking at (lots of expression templates in
> TensorFlow) skipping linkage names in the cases I think we can reconstitute
> (but I haven't implemented the full logic and verified everything can be
> reconstituted) reduced .debug_str.dwo by 52% (and that composes/stacks with
> the 43% reduction from the simplified template names - for a 95% reduction
> in total) and in a large but less pathological binary it was 56% (in
> addition to 25% from the template names, still 80% reduction overall).
>
> Wondering if anyone's interested in this? Has
> thoughts/feelings/concerns/etc?