<div dir="ltr">One possibility is to make the reference to the linkage name an indirection into .strtab proper rather than into .debug_str. That raises issues with stripping and the like, but you then have only one copy shared between the two uses.</div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Jul 1, 2021 at 8:22 PM Reid Kleckner via llvm-dev &lt;<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">It could work, but the long linkage names will still be present in .strtab, so I wonder if it would make more sense to pursue a solution that addresses both issues. I happen to know you were considering a separate proposal for that, and I wonder if it could be used to solve this problem as well. Either way, the debug info consumer must be taught to look up or reconstitute the long mangled name.<div><br></div><div>I was thinking something like, "if a symbol name is longer than some threshold X, replace it with _H${contenthash} and place the long name in a side table section". Tools that are aware of the new convention can do the lookup in the side table. Tools that are unaware will just produce funny names. The DWARF linkage name would use the _H symbol, and consumers that need more than a unique linkage identifier can do the lookup.</div><div><br></div><div>There is prior art for this.
MSVC caps linkage names at 4096 characters, I believe, and hashes longer names down with MD5:</div><div><a href="https://github.com/llvm/llvm-project/blob/main/clang/lib/AST/MicrosoftMangle.cpp#L53" target="_blank">https://github.com/llvm/llvm-project/blob/main/clang/lib/AST/MicrosoftMangle.cpp#L53</a></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Jun 24, 2021 at 5:32 PM David Blaikie via llvm-dev &lt;<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">In addition to simplifying template names ( <a href="https://groups.google.com/g/llvm-dev/c/ekLMllbLIZg" target="_blank">https://groups.google.com/g/llvm-dev/c/ekLMllbLIZg</a> ), another source of bloat I've found in my use case is the sheer number of mangled names (in part because we build with -fdebug-info-for-profiling, which turns on function linkage names even at -g1/-gmlt).<br><br>So I was wondering if we could recreate linkage names from DWARF rather than encoding them directly, and I have a prototype that seems to show this is possible (at least for some simple cases, including some template cases).<br><br>In the pathological case I'm looking at (lots of expression templates in TensorFlow), skipping linkage names in the cases I think we can reconstitute (though I haven't implemented the full logic or verified that everything can be reconstituted) reduced .debug_str.dwo by 52% (and that composes/stacks with the 43% reduction from the simplified template names, for a 95% reduction in total), and in a large but less pathological binary it was 56% (in addition to 25% from the template names, still 80% reduction overall).<br><br>Is anyone interested in this? Any thoughts/feelings/concerns/etc?</div>
_______________________________________________<br>
LLVM Developers mailing list<br>
<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a><br>
<a href="https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" rel="noreferrer" target="_blank">https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a><br>
</blockquote></div>
</blockquote></div>
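<div dir="ltr">To make the quoted "replace it with _H${contenthash}" idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the function names <code>shorten_symbol</code> and <code>resolve_symbol</code> are hypothetical, the 4096 cutoff and MD5 are simply borrowed from the MSVC figures mentioned in the thread, and a plain dict stands in for the side table section. It is not how any existing tool does it.</div>

```python
import hashlib

# Assumed cutoff, borrowed from the MSVC figure quoted in the thread.
THRESHOLD = 4096


def shorten_symbol(name, side_table, threshold=THRESHOLD):
    """Return `name` unchanged if it is short enough; otherwise return an
    _H<hash> stand-in and record the long name in `side_table`.

    `side_table` is a dict standing in for the proposed side table section.
    """
    if len(name) <= threshold:
        return name
    # MD5 is used only as a content hash here, not for security.
    digest = hashlib.md5(name.encode("utf-8")).hexdigest()
    short = "_H" + digest
    side_table[short] = name
    return short


def resolve_symbol(name, side_table):
    """What an aware consumer would do: map an _H symbol back to the long
    name, and pass every other name through untouched."""
    return side_table.get(name, name)


if __name__ == "__main__":
    table = {}
    long_name = "_Z" + "x" * 5000  # made-up over-long mangled name
    short = shorten_symbol(long_name, table)
    print(short, len(short))
    print(resolve_symbol(short, table) == long_name)
```

<div dir="ltr">A tool that does not know the convention would simply show the _H name, which matches the "funny names" behaviour described above. Because the short name is derived from the content of the long one, two objects that shorten the same long name independently should end up with the same stand-in.</div>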