[llvm-dev] [RFC] - Deduplication of debug information in linkers (LLD).
George Rimar via llvm-dev
llvm-dev at lists.llvm.org
Tue Dec 5 05:50:48 PST 2017
Thanks for the answers, Paul!
>So, I think whether type sections help or hurt will depend on how a particular project's build procedure is set up. Clang/LLVM are set up to do lots of smaller compilations and link them all together, in a fairly traditional model, and that is where type sections will provide the most benefit. Your data, then, is essentially for a best-case scenario. Other kinds of projects will not benefit as much.
This inspired me to run additional tests on the LLVM binaries to see how much they gain when -fdebug-types-section is enabled.
(The full table of results is at the end of this mail.)
During the experiment I observed both an object size penalty and a final executable size penalty for a single binary:
1) The size of the .a files in LLVM/lib increases from 6.5GB to 7.7GB.
2) One binary, llvm-PerfectShuffle, was larger with the flag: its size changed from 120064 to 124952 bytes.
For all the others, the flag usually gives a noticeable win (a size reduction of up to 41%).
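As a rough sketch of how the "x" ratios in the results table relate to the percentage quoted above: a ratio r = size with -g / size with the flag corresponds to a reduction of 1 - 1/r. The function name below is hypothetical and used only for illustration.

```python
def ratio_to_reduction_percent(ratio: float) -> float:
    """Convert a 'before / after' size ratio into a size reduction in percent.

    A negative result means the binary grew when the flag was enabled.
    """
    return (1.0 - 1.0 / ratio) * 100.0


# Ratios taken from the results table at the end of the mail.
best = ratio_to_reduction_percent(1.708)   # llvm-dwp, the largest ratio
worst = ratio_to_reduction_percent(0.960)  # llvm-PerfectShuffle, the only regression

print(f"llvm-dwp reduction: {best:.1f}%")
print(f"llvm-PerfectShuffle reduction: {worst:.1f}%")
```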
>Regarding DWARF 5 and emitting type sections into the .debug_info section rather than the .debug_types section: The work to support DWARF 5 in LLVM has not gotten very far yet. Conforming to the standard in this respect is certainly on my list, however there are other features that Sony considers higher priority. If you or someone else wants to contribute that feature sooner, that would be excellent! Otherwise, we will get to it in due time.
>Thanks,
>--paulr
I am going to look at it more closely. At least, I do not think LLD would currently work correctly with multiple
.debug_info sections when building .gdb_index. We expect to see a single .debug_info section in an object file and
will probably do something wrong otherwise. It looks like llvm/DebugInfo needs to be fixed first, which
also affects tools like llvm-dwarfdump and probably others. I am going to investigate all of that.
Testing results:
----------------------------------------------------------------
Name Size in bytes (-g / -g -fdebug-types-section = ratio)
----------------------------------------------------------------
arcmt-test 461644608 / 322758048 = 1.430x
bugpoint 938191280 / 552402624 = 1.698x
c-arcmt-test 20968 / 20968 = 1.000x
c-index-test 941325408 / 613643776 = 1.533x
clang-6.0 1697417824 / 1025908400 = 1.654x
clang-check 1440335448 / 864954472 = 1.665x
clang-diff 422183328 / 293650384 = 1.437x
clang-format 67763352 / 51596584 = 1.313x
clang-func-mapping 423746376 / 294311536 = 1.439x
clang-import-test 611477912 / 410019056 = 1.491x
clang-offload-bundler 76254024 / 61321152 = 1.243x
clang-refactor 448153976 / 311549496 = 1.438x
clang-rename 441661264 / 307777416 = 1.435x
clang-tblgen 17489504 / 16802744 = 1.040x
count 18392 / 18392 = 1.000x
diagtool 415912688 / 289701512 = 1.435x
FileCheck 6681280 / 6489896 = 1.029x
llc 903308048 / 529531864 = 1.705x
lld 1009754992 / 620445232 = 1.627x
lli 419682176 / 270912680 = 1.549x
lli-child-target 77237632 / 63011888 = 1.225x
llvm-ar 131787624 / 102692104 = 1.283x
llvm-as 72916752 / 57792456 = 1.261x
llvm-bcanalyzer 6464984 / 6259992 = 1.032x
llvm-cat 73318016 / 57999784 = 1.264x
llvm-cfi-verify 160259072 / 125738440 = 1.274x
llvm-config 5947768 / 5776752 = 1.029x
llvm-cov 80728632 / 65663448 = 1.229x
llvm-c-test 843631952 / 498768912 = 1.691x
llvm-cvtres 72163840 / 58065104 = 1.242x
llvm-cxxdump 74284720 / 59261168 = 1.253x
llvm-cxxfilt 7046752 / 6865368 = 1.026x
llvm-demangle-fuzzer 70156288 / 55760784 = 1.258x
llvm-diff 58551832 / 46506104 = 1.259x
llvm-dis 52982824 / 42252624 = 1.253x
llvm-dsymutil 883071928 / 517877728 = 1.705x
llvm-dwarfdump 121679064 / 95079960 = 1.279x
llvm-dwp 879362280 / 514570584 = 1.708x
llvm-extract 115790888 / 87646504 = 1.321x
llvm-isel-fuzzer 887217736 / 519910464 = 1.706x
llvm-link 79158192 / 62087976 = 1.274x
llvm-lto 932838656 / 553536912 = 1.685x
llvm-lto2 926319416 / 550018696 = 1.684x
llvm-mc 118139784 / 89656216 = 1.317x
llvm-mcmarkup 5974664 / 5775368 = 1.034x
llvm-modextract 68740776 / 54352208 = 1.264x
llvm-mt 6749720 / 6440088 = 1.048x
llvm-nm 131633536 / 102825080 = 1.280x
llvm-objcopy 73991272 / 60029840 = 1.232x
llvm-objdump 150270880 / 118629456 = 1.266x
llvm-opt-fuzzer 891258608 / 527493664 = 1.689x
llvm-opt-report 8814368 / 8585952 = 1.026x
llvm-pdbutil 110919744 / 93010704 = 1.192x
llvm-PerfectShuffle 120064 / 124952 = 0.960x
llvm-profdata 41889560 / 32957976 = 1.270x
llvm-rc 8954768 / 8551192 = 1.047x
llvm-readobj 85723040 / 70542776 = 1.215x
llvm-rtdyld 138255056 / 108085992 = 1.279x
llvm-size 71567376 / 57589872 = 1.259x
llvm-split 125299816 / 95063600 = 1.318x
llvm-stress 46366576 / 37211688 = 1.246x
llvm-strings 5746216 / 5563216 = 1.032x
llvm-symbolizer 87280568 / 71248216 = 1.225x
llvm-tblgen 49304088 / 42580848 = 1.157x
llvm-xray 93953928 / 77434112 = 1.213x
not 5495536 / 5325816 = 1.031x
obj2yaml 97146752 / 81415480 = 1.193x
opt 955386696 / 564492184 = 1.692x
sancov 146145680 / 114837520 = 1.272x
sanstats 87031832 / 71004312 = 1.225x
scan-build 53444 / 53444 = 1.000x
scan-view 4504 / 4504 = 1.000x
verify-uselistorder 73506560 / 58211520 = 1.262x
yaml2obj 27882712 / 26506184 = 1.051x
yaml-bench 7001024 / 6763952 = 1.035x
From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On Behalf Of George Rimar via llvm-dev
Sent: Monday, December 04, 2017 7:11 AM
To: llvm-dev at lists.llvm.org
Subject: [llvm-dev] [RFC] - Deduplication of debug information in linkers (LLD).
Hi all!
We have an issue with LLD: the error "relocation R_X86_64_32 out of range" (PR31109),
which occurs when resolving relocations in debug sections. It seems to happen
because the .debug_info section can sometimes be too large for a 32-bit relocation
to represent the value. One possible solution is to deduplicate the debug information
to reduce the .debug_info size.
The rest of this mail describes the experiments I did, the results I obtained, and
some questions and suggestions.
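To illustrate why a large debug section triggers this error, here is a minimal sketch of the range check, assuming the relocated value is an offset into the output debug section. R_X86_64_32 stores a zero-extended 32-bit value, so anything at or above 4 GiB cannot be represented. The helper names are hypothetical and this is not LLD's actual code.

```python
UINT32_MAX = 0xFFFFFFFF  # largest value an R_X86_64_32 field can hold


def fits_r_x86_64_32(value: int) -> bool:
    """Return True if value is representable as a zero-extended 32-bit field."""
    return 0 <= value <= UINT32_MAX


def check_relocation(value: int) -> None:
    # Hypothetical stand-in for the linker's check, for illustration only.
    if not fits_r_x86_64_32(value):
        raise OverflowError("relocation R_X86_64_32 out of range")


# An offset just below 4 GiB is accepted; one byte more is rejected.
check_relocation(UINT32_MAX)
try:
    check_relocation(UINT32_MAX + 1)
except OverflowError as err:
    print(err)
```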
I was investigating the idea of deduplicating debug type information. The idea is described on
p276 of the DWARF4 specification (http://www.dwarfstd.org/doc/DWARF4.pdf). It suggests
splitting type information out of .debug_info and emitting multiple .debug_types sections
using COMDATs. Both the clang and gcc versions I tested implement the -fdebug-types-section flag for that:
-fdebug-types-section, -fno-debug-types-section
Place debug types in their own section (ELF Only)
gcc's description is here: https://gcc.gnu.org/onlinedocs/gcc-6.4.0/gcc/Debugging-Options.html#Debugging-Options.
This flag is disabled by default. I compared clang binaries to see the difference
with and without the linker side optimisation.
1) Clang built with -g has a size of 1.7 GB; the .debug_info section size is 894.5 MB.
2) Clang built with -g -fdebug-types-section has a size of 1.0 GB;
the .debug_types size is 26.267 MB and the .debug_info size is 227.7 MB.
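The section sizes above can be combined in a short worked calculation. This is only arithmetic on the figures quoted in this mail; the constant and function names are made up for the example.

```python
# Section sizes quoted above, in MB.
DEBUG_INFO_PLAIN = 894.5      # .debug_info with -g
DEBUG_INFO_WITH_FLAG = 227.7  # .debug_info with -g -fdebug-types-section
DEBUG_TYPES = 26.267          # .debug_types with -g -fdebug-types-section


def combined_with_flag() -> float:
    """Total size of .debug_info plus .debug_types when the flag is used."""
    return DEBUG_INFO_WITH_FLAG + DEBUG_TYPES


def shrink_factor() -> float:
    """How many times smaller the type-carrying debug sections become."""
    return DEBUG_INFO_PLAIN / combined_with_flag()


print(f"combined with flag: {combined_with_flag():.3f} MB")
print(f"shrink factor: {shrink_factor():.2f}x")
```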
The difference is huge and I believe it shows (though for most readers here it was probably
already obvious) that the optimization can be useful. Still, -fdebug-types-section is disabled by default.
It looks like it was initially disabled because not all DWARF consumers were aware of the .debug_types section.
Now, in 2017, the situation is different. I think most DWARF consumers know about .debug_types, but:
1) The DWARF5 specification explicitly eliminates the .debug_types section introduced in DWARF4:
p8, "1.4 Changes from Version 4 to Version 5" http://dwarfstd.org/doc/DWARF5.pdf
2) Instead of emitting multiple .debug_types sections, it suggests emitting multiple .debug_info COMDAT
sections (p375, p376).
And it seems there is currently no way to make clang emit multiple .debug_info sections with type information,
as DWARF5 suggests. I tried the command line below:
-g -fdebug-types-section -gdwarf-5
It still emits .debug_types, and there does not appear to be a flag for emitting multiple .debug_info sections.
Looking at the LLVM code as a whole (lib/MC, lib/CodeGen), it seems it is simply always assumed that .debug_info is
a unique section in an object.
(I am also not sure why clang emits .debug_types when the -gdwarf-5 flag is set, as this section is incompatible with v5;
it is probably a bug.)
So my questions are the following:
1) Do we want to try to implement the multiple .debug_info approach? It seems it can be very useful sometimes.
2) For now, in LLD, maybe we want to extend our error message from "relocation X out of range" to something
suggesting the use of -fdebug-types-section (only for relocations in debug sections)?
3) Why is -fdebug-types-section disabled by default?
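For question 2, the extended diagnostic could look roughly like the sketch below. It is purely illustrative: the function name, the ".debug_" prefix test, and the exact wording of the hint are assumptions, not existing LLD behaviour.

```python
def relocation_error(reloc_name: str, section_name: str) -> str:
    """Build an out-of-range message, adding a hint only for debug sections."""
    msg = f"relocation {reloc_name} out of range"
    # Assumption: debug sections are recognised by their ".debug_" name prefix.
    if section_name.startswith(".debug_"):
        msg += "; consider compiling with -fdebug-types-section to reduce debug info size"
    return msg


print(relocation_error("R_X86_64_32", ".debug_info"))
print(relocation_error("R_X86_64_32", ".text"))
```

Restricting the hint to debug sections keeps the message unchanged for ordinary code and data relocations, where the flag would not help.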
Best regards,
George | Developer | Access Softek, Inc