[llvm-dev] [RFC] - Deduplication of debug information in linkers (LLD).

George Rimar via llvm-dev llvm-dev at lists.llvm.org
Tue Dec 5 05:50:48 PST 2017


Thanks for the answers, Paul!

>So, I think whether type sections help or hurt will depend on how a particular project's build procedure is set up.
>Clang/LLVM are set up to do lots of smaller compilations and link them all together, in a fairly traditional model,
>and that is where type sections will provide the most benefit.  Your data, then, is essentially for a best-case
>scenario.  Other kinds of projects will not benefit as much.

This inspired me to run additional tests on the LLVM binaries to see how much they gain from -fdebug-types-section.
(The full table with results is at the end of this mail.)
During the experiment I observed size penalties both for object files and for one final executable:
1) The total size of the .a files in LLVM/lib increases from 6.5 GB to 7.7 GB.
2) One binary, llvm-PerfectShuffle, was larger with the flag: its size changed from 120064 to 124952 bytes.
   For all the others, the flag usually gives a noticeable win (a size reduction of up to 41%).

>Regarding DWARF 5 and emitting type sections into the .debug_info section rather than the .debug_types section:
>The work to support DWARF 5 in LLVM has not gotten very far yet.  Conforming to the standard in this respect is
>certainly on my list, however there are other features that Sony considers higher priority.  If you or someone
>else wants to contribute that feature sooner, that would be excellent!  Otherwise, we will get to it in due time.
>Thanks,
>--paulr

I am going to look at it more closely. At the least, I do not think LLD would currently work correctly with
multiple .debug_info sections when building .gdb_index. We expect to see a single .debug_info section in an object
file and will probably do something wrong otherwise. It looks like llvm/DebugInfo needs to be fixed first, which
also affects tools like llvm-dwarfdump and probably others. I am going to investigate all of that.

Testing results:
----------------------------------------------------------------------------
Name                Size in bytes (-g / -g -fdebug-types-section = ratio)
----------------------------------------------------------------------------
arcmt-test             461644608 /  322758048 = 1.430x
bugpoint               938191280 /  552402624 = 1.698x
c-arcmt-test               20968 /      20968 = 1.000x
c-index-test           941325408 /  613643776 = 1.533x
clang-6.0             1697417824 / 1025908400 = 1.654x
clang-check           1440335448 /  864954472 = 1.665x
clang-diff             422183328 /  293650384 = 1.437x
clang-format            67763352 /   51596584 = 1.313x
clang-func-mapping     423746376 /  294311536 = 1.439x
clang-import-test      611477912 /  410019056 = 1.491x
clang-offload-bundler   76254024 /   61321152 = 1.243x
clang-refactor         448153976 /  311549496 = 1.438x
clang-rename           441661264 /  307777416 = 1.435x
clang-tblgen            17489504 /   16802744 = 1.040x
count                      18392 /      18392 = 1.000x
diagtool               415912688 /  289701512 = 1.435x
FileCheck                6681280 /    6489896 = 1.029x
llc                    903308048 /  529531864 = 1.705x
lld                   1009754992 /  620445232 = 1.627x
lli                    419682176 /  270912680 = 1.549x
lli-child-target        77237632 /   63011888 = 1.225x
llvm-ar                131787624 /  102692104 = 1.283x
llvm-as                 72916752 /   57792456 = 1.261x
llvm-bcanalyzer          6464984 /    6259992 = 1.032x
llvm-cat                73318016 /   57999784 = 1.264x
llvm-cfi-verify        160259072 /  125738440 = 1.274x
llvm-config              5947768 /    5776752 = 1.029x
llvm-cov                80728632 /   65663448 = 1.229x
llvm-c-test            843631952 /  498768912 = 1.691x
llvm-cvtres             72163840 /   58065104 = 1.242x
llvm-cxxdump            74284720 /   59261168 = 1.253x
llvm-cxxfilt             7046752 /    6865368 = 1.026x
llvm-demangle-fuzzer    70156288 /   55760784 = 1.258x
llvm-diff               58551832 /   46506104 = 1.259x
llvm-dis                52982824 /   42252624 = 1.253x
llvm-dsymutil          883071928 /  517877728 = 1.705x
llvm-dwarfdump         121679064 /   95079960 = 1.279x
llvm-dwp               879362280 /  514570584 = 1.708x
llvm-extract           115790888 /   87646504 = 1.321x
llvm-isel-fuzzer       887217736 /  519910464 = 1.706x
llvm-link               79158192 /   62087976 = 1.274x
llvm-lto               932838656 /  553536912 = 1.685x
llvm-lto2              926319416 /  550018696 = 1.684x
llvm-mc                118139784 /   89656216 = 1.317x
llvm-mcmarkup            5974664 /    5775368 = 1.034x
llvm-modextract         68740776 /   54352208 = 1.264x
llvm-mt                  6749720 /    6440088 = 1.048x
llvm-nm                131633536 /  102825080 = 1.280x
llvm-objcopy            73991272 /   60029840 = 1.232x
llvm-objdump           150270880 /  118629456 = 1.266x
llvm-opt-fuzzer        891258608 /  527493664 = 1.689x
llvm-opt-report          8814368 /    8585952 = 1.026x
llvm-pdbutil           110919744 /   93010704 = 1.192x
llvm-PerfectShuffle       120064 /     124952 = 0.960x
llvm-profdata           41889560 /   32957976 = 1.270x
llvm-rc                  8954768 /    8551192 = 1.047x
llvm-readobj            85723040 /   70542776 = 1.215x
llvm-rtdyld            138255056 /  108085992 = 1.279x
llvm-size               71567376 /   57589872 = 1.259x
llvm-split             125299816 /   95063600 = 1.318x
llvm-stress             46366576 /   37211688 = 1.246x
llvm-strings             5746216 /    5563216 = 1.032x
llvm-symbolizer         87280568 /   71248216 = 1.225x
llvm-tblgen             49304088 /   42580848 = 1.157x
llvm-xray               93953928 /   77434112 = 1.213x
not                      5495536 /    5325816 = 1.031x
obj2yaml                97146752 /   81415480 = 1.193x
opt                    955386696 /  564492184 = 1.692x
sancov                 146145680 /  114837520 = 1.272x
sanstats                87031832 /   71004312 = 1.225x
scan-build                 53444 /      53444 = 1.000x
scan-view                   4504 /       4504 = 1.000x
verify-uselistorder     73506560 /   58211520 = 1.262x
yaml2obj                27882712 /   26506184 = 1.051x
yaml-bench               7001024 /    6763952 = 1.035x
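
For anyone who wants to recompute the ratio column, here is a minimal Python sketch. The function names are
mine and are not part of any LLVM tooling. It assumes the ratios in the table are cut off after three decimals
rather than rounded, which is what the llvm-dwp and llvm-PerfectShuffle rows suggest.

```python
import math


def truncated_ratio(plain_size: int, types_size: int) -> float:
    """Size with -g divided by size with -g -fdebug-types-section, cut to 3 decimals."""
    return math.floor(plain_size / types_size * 1000) / 1000


def reduction_percent(plain_size: int, types_size: int) -> float:
    """Percentage by which the flag shrinks the binary (negative if it grows)."""
    return (1 - types_size / plain_size) * 100


# Two rows copied from the table: (size with -g, size with -g -fdebug-types-section).
rows = {
    "llvm-dwp": (879362280, 514570584),
    "llvm-PerfectShuffle": (120064, 124952),
}
for name, (plain, types) in rows.items():
    print(f"{name}: {truncated_ratio(plain, types):.3f}x, "
          f"{reduction_percent(plain, types):+.1f}% size reduction")
```

A ratio below 1.000x, as for llvm-PerfectShuffle, means the binary grew with the flag.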



From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On Behalf Of George Rimar via llvm-dev
Sent: Monday, December 04, 2017 7:11 AM
To: llvm-dev at lists.llvm.org
Subject: [llvm-dev] [RFC] - Deduplication of debug information in linkers (LLD).

Hi all!

We have an issue with LLD: the error "relocation R_X86_64_32 out of range" (PR31109),
which occurs when resolving relocations in debug sections. It seems to happen
because the .debug_info section can sometimes be too large for a 32-bit relocation
to represent the value. One possible solution is to deduplicate debug information
to reduce the size of .debug_info.
The rest of this mail describes the experiments I did, the results I obtained, and
some questions and suggestions.
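
To illustrate the range problem: R_X86_64_32 stores the resolved value in a 32-bit field that must
zero-extend back to the original 64-bit value, so anything at or above 4 GiB cannot be represented.
Below is a minimal Python sketch of that check. The helper name and the sample values are hypothetical
and are not taken from LLD or from PR31109.

```python
UINT32_MAX = 2**32 - 1  # largest value an unsigned 32-bit field can hold


def fits_r_x86_64_32(value: int) -> bool:
    """True if `value` can be stored by a zero-extended 32-bit relocation."""
    return 0 <= value <= UINT32_MAX


GIB = 1024**3
# Hypothetical resolved values (symbol + addend), not measurements from a real link.
for value in (3 * GIB, 5 * GIB):
    status = "fits" if fits_r_x86_64_32(value) else "out of range"
    print(f"{value:#x}: {status}")
```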

I was investigating the idea of deduplicating debug type information. The idea is described on
p276 of the DWARF4 specification (http://www.dwarfstd.org/doc/DWARF4.pdf). It suggests
splitting type information out of .debug_info and emitting multiple .debug_types sections
using COMDATs. Both clang and the gcc I tested implement the -fdebug-types-section flag for that:

-fdebug-types-section, -fno-debug-types-section
Place debug types in their own section (ELF Only)
gcc's description is here: https://gcc.gnu.org/onlinedocs/gcc-6.4.0/gcc/Debugging-Options.html#Debugging-Options.

This flag is disabled by default. I compared clang binaries to see the difference
with and without this linker-side optimisation.
1) Clang built with -g has a size of 1.7 GB; the .debug_info section size is 894.5 MB.
2) Clang built with -g -fdebug-types-section has a size of 1.0 GB;
   the .debug_types size is 26.267 MB and the .debug_info size is 227.7 MB.

The difference is huge, and I believe it shows (though for most readers here it was probably
already obvious) that the optimization can be useful. Still, -fdebug-types-section is disabled by default.
It looks like it was initially disabled because not all DWARF consumers were aware of the .debug_types section.

Now, in 2017, the situation is different. I think most DWARF consumers know about .debug_types, but:
1) The DWARF5 specification explicitly eliminates the .debug_types section introduced in DWARF4:
   p8, "1.4 Changes from Version 4 to Version 5", http://dwarfstd.org/doc/DWARF5.pdf
2) Instead of emitting multiple .debug_types sections, it suggests emitting multiple .debug_info COMDAT
   sections (p375, p376).

It seems there is currently no way to make clang emit multiple .debug_info sections with type information
as DWARF5 suggests. I tried the command line flags below:
-g -fdebug-types-section -gdwarf-5
It still emits .debug_types, and there does not seem to be a flag for emitting multiple .debug_info sections.
Looking at the LLVM code as a whole (lib/MC, lib/CodeGen), it seems to always assume that .debug_info is
a unique section in an object file.
(I am also not sure why clang emits .debug_types when -gdwarf-5 is set, as this section is incompatible with v5;
it is probably a bug.)

So my questions are the following:
1) Do we want to try implementing the multiple .debug_info approach? It seems it could be very useful sometimes.
2) For now, in LLD, should we extend our error message from "relocation X out of range" to something
   that suggests using -fdebug-types-section (only for relocations in debug sections)?
3) Why is -fdebug-types-section disabled by default?
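
As a rough illustration of question 2, here is a Python sketch of the message logic. This is not LLD code:
the function name, the ".debug_" prefix test and the hint wording are all assumptions made for the example.

```python
def out_of_range_message(reloc_name: str, section_name: str) -> str:
    """Build the error text, adding a hint only for relocations in debug sections."""
    message = f"relocation {reloc_name} out of range"
    # Assumption: debug sections are recognised by their ".debug_" name prefix.
    if section_name.startswith(".debug_"):
        message += ("; consider recompiling with -fdebug-types-section "
                    "to reduce debug info size")
    return message


print(out_of_range_message("R_X86_64_32", ".debug_info"))
print(out_of_range_message("R_X86_64_32", ".text"))
```

Relocations outside debug sections keep the existing message unchanged, since the flag would not help there.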

Best regards,
George | Developer | Access Softek, Inc