[llvm-dev] Fragmented DWARF
Alexey Lapshin via llvm-dev
llvm-dev at lists.llvm.org
Wed Nov 4 05:35:46 PST 2020
On 04.11.2020 15:28, James Henderson wrote:
> Hi Alexey,
>
> Thanks for taking a look at these. I noticed you set the
> --mark-live-pc value to a value other than 1 for the fragmented DWARF
> version. This will mean additional GC-ing will be done beyond the
> amount that --gc-sections will do, so unless you use the same value
> for the option for other versions, the result will not be comparable.
> (The option is purely there to experiment with the effects were
> different amounts of the input codebase to be considered dead). Would
> you be okay to run those figures again without the option specified?
Oh, I misinterpreted that option. Here are the updated results:
1. llvm-strings:
source object files size: 381M.
fragmented source object files size: 451M(18% increase).
a. upstream version,
command line options: --gc-sections
binary size: 6,5M
compilation time: 0:00.13 sec
run-time memory: 111kb
b. "fragmented DWARF" version,
command line options: --gc-sections
binary size: 5,3M
compilation time: 0:00.11 sec
run-time memory: 125kb
c. DWARFLinker version,
command line options: --gc-sections --gc-debuginfo
binary size: 3,8M
compilation time: 0:00.33 sec
run-time memory: 141kb
d. DWARFLinker no-odr version,
command line options: --gc-sections --gc-debuginfo
--gc-debuginfo-no-odr
binary size: 4,3M
compilation time: 0:00.38 sec
run-time memory: 142kb
2. clang:
source object files size: 6,5G.
fragmented source object files size: 7,3G(13% increase).
a. upstream version,
command line options: --gc-sections
binary size: 1,5G
compilation time: 6 sec
run-time memory: 9.7G
b. "fragmented DWARF" version,
command line options: --gc-sections
binary size: 1,4G
compilation time: 8 sec
run-time memory: 12G
c. DWARFLinker version,
command line options: --gc-sections --gc-debuginfo
binary size: 836M
compilation time: 62 sec
run-time memory: 15G
d. DWARFLinker no-odr version,
command line options: --gc-sections --gc-debuginfo
--gc-debuginfo-no-odr
binary size: 1,3G
compilation time: 128 sec
run-time memory: 17G
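
To make the clang figures above easier to compare, the relative changes against the upstream link can be worked out with a few lines of Python. This is only a sketch: the numbers are copied from the list above, with decimal commas converted and 1G treated as 1000M (an assumption about the units), and the `relative_change` helper is a name made up for illustration.

```python
# Clang results from the list above: (binary size in MB, link time in seconds).
# Assumption: "1,5G" means 1.5 GB, and 1 GB is treated as 1000 MB.
results = {
    "upstream": (1500, 6),
    "fragmented DWARF": (1400, 8),
    "DWARFLinker": (836, 62),
    "DWARFLinker no-odr": (1300, 128),
}


def relative_change(value, baseline):
    """Return the change of value relative to baseline, in percent."""
    return (value - baseline) / baseline * 100.0


base_size, base_time = results["upstream"]
for name, (size, time) in results.items():
    size_delta = relative_change(size, base_size)
    slowdown = time / base_time
    print(f"{name:20s} size {size_delta:+6.1f}%  link time x{slowdown:.1f}")
```

With these inputs the DWARFLinker binary comes out roughly 44% smaller than upstream at about ten times the link time, while the fragmented DWARF binary is about 7% smaller.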
Detailed size results:
1. a)
FILE SIZE VM SIZE
-------------- --------------
41.1% 2.64Mi 0.0% 0 .debug_info
24.9% 1.60Mi 0.0% 0 .debug_str
12.6% 827Ki 0.0% 0 .debug_line
6.5% 428Ki 63.8% 428Ki .text
4.8% 317Ki 0.0% 0 .strtab
3.4% 223Ki 0.0% 0 .debug_ranges
2.0% 133Ki 19.8% 133Ki .eh_frame
1.7% 110Ki 0.0% 0 .symtab
1.2% 77.6Ki 0.0% 0 .debug_abbrev
b)
FILE SIZE VM SIZE
-------------- --------------
40.2% 2.10Mi 0.0% 0 .debug_info
30.7% 1.60Mi 0.0% 0 .debug_str
8.0% 428Ki 63.8% 428Ki .text
5.9% 317Ki 0.0% 0 .strtab
5.9% 313Ki 0.0% 0 .debug_line
2.5% 133Ki 19.8% 133Ki .eh_frame
2.1% 110Ki 0.0% 0 .symtab
1.5% 77.6Ki 0.0% 0 .debug_abbrev
1.3% 69.2Ki 0.0% 0 .debug_ranges
c)
FILE SIZE VM SIZE
-------------- --------------
33.0% 1.25Mi 0.0% 0 .debug_info
29.2% 1.11Mi 0.0% 0 .debug_str
11.0% 428Ki 63.8% 428Ki .text
8.2% 317Ki 0.0% 0 .strtab
7.8% 304Ki 0.0% 0 .debug_line
3.4% 133Ki 19.8% 133Ki .eh_frame
2.8% 110Ki 0.0% 0 .symtab
1.7% 65.9Ki 0.0% 0 .debug_ranges
1.0% 38.4Ki 5.7% 38.4Ki .rodata
d)
FILE SIZE VM SIZE
-------------- --------------
39.7% 1.68Mi 0.0% 0 .debug_info
26.3% 1.11Mi 0.0% 0 .debug_str
9.9% 428Ki 63.8% 428Ki .text
7.3% 317Ki 0.0% 0 .strtab
7.0% 304Ki 0.0% 0 .debug_line
3.1% 133Ki 19.8% 133Ki .eh_frame
2.6% 110Ki 0.0% 0 .symtab
1.5% 65.9Ki 0.0% 0 .debug_ranges
2. a)
FILE SIZE VM SIZE
-------------- --------------
58.3% 878Mi 0.0% 0 .debug_info
11.8% 177Mi 0.0% 0 .debug_str
7.7% 115Mi 62.2% 115Mi .text
7.7% 115Mi 0.0% 0 .debug_line
6.0% 90.7Mi 0.0% 0 .strtab
2.4% 35.4Mi 0.0% 0 .debug_ranges
1.5% 23.3Mi 12.5% 23.3Mi .eh_frame
1.5% 23.0Mi 12.4% 23.0Mi .rodata
1.2% 17.9Mi 0.0% 0 .symtab
b)
FILE SIZE VM SIZE
-------------- --------------
59.6% 807Mi 0.0% 0 .debug_info
13.1% 177Mi 0.0% 0 .debug_str
8.5% 115Mi 62.2% 115Mi .text
6.7% 90.7Mi 0.0% 0 .strtab
4.2% 57.4Mi 0.0% 0 .debug_line
1.7% 23.3Mi 12.5% 23.3Mi .eh_frame
1.7% 23.0Mi 12.4% 23.0Mi .rodata
1.3% 17.9Mi 0.0% 0 .symtab
1.0% 13.0Mi 0.0% 0 .debug_ranges
0.8% 10.6Mi 5.7% 10.6Mi .dynstr
c)
FILE SIZE VM SIZE
-------------- --------------
35.1% 293Mi 0.0% 0 .debug_info
21.2% 177Mi 0.0% 0 .debug_str
13.9% 115Mi 62.2% 115Mi .text
10.9% 90.7Mi 0.0% 0 .strtab
6.9% 57.4Mi 0.0% 0 .debug_line
2.8% 23.3Mi 12.5% 23.3Mi .eh_frame
2.8% 23.0Mi 12.4% 23.0Mi .rodata
2.1% 17.9Mi 0.0% 0 .symtab
1.5% 12.4Mi 0.0% 0 .debug_ranges
1.3% 10.6Mi 5.7% 10.6Mi .dynstr
d)
FILE SIZE VM SIZE
-------------- --------------
58.3% 758Mi 0.0% 0 .debug_info
13.6% 177Mi 0.0% 0 .debug_str
8.9% 115Mi 62.2% 115Mi .text
7.0% 90.7Mi 0.0% 0 .strtab
4.4% 57.4Mi 0.0% 0 .debug_line
1.8% 23.3Mi 12.5% 23.3Mi .eh_frame
1.8% 23.0Mi 12.4% 23.0Mi .rodata
1.4% 17.9Mi 0.0% 0 .symtab
1.0% 12.4Mi 0.0% 0 .debug_ranges
0.8% 10.6Mi 5.7% 10.6Mi .dynstr
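
The per-section tables above can be totalled mechanically if one wants a single "debug sections" number per variant. The sketch below assumes each row has five whitespace-separated columns (file %, file size, VM %, VM size, section name) and that Ki/Mi/Gi are binary units; `parse_size` and `debug_file_size` are hypothetical helper names.

```python
import re

# Binary unit suffixes as they appear in the tables above.
UNITS = {"Ki": 1024, "Mi": 1024 ** 2, "Gi": 1024 ** 3}


def parse_size(text):
    """Convert a size such as '2.64Mi' or '428Ki' (or a bare '0') to bytes."""
    match = re.fullmatch(r"([0-9.]+)(Ki|Mi|Gi)?", text)
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    number, unit = match.groups()
    return float(number) * UNITS.get(unit, 1)


def debug_file_size(table):
    """Sum the FILE SIZE column over all .debug_* rows of one table."""
    total = 0.0
    for line in table.strip().splitlines():
        fields = line.split()
        # Expected row shape: file %, file size, VM %, VM size, section name.
        if len(fields) == 5 and fields[4].startswith(".debug_"):
            total += parse_size(fields[1])
    return total


# Rows copied from table 1. a) above (llvm-strings, upstream version).
upstream = """
41.1% 2.64Mi 0.0% 0 .debug_info
24.9% 1.60Mi 0.0% 0 .debug_str
12.6% 827Ki 0.0% 0 .debug_line
6.5% 428Ki 63.8% 428Ki .text
3.4% 223Ki 0.0% 0 .debug_ranges
1.2% 77.6Ki 0.0% 0 .debug_abbrev
"""

print(f"{debug_file_size(upstream) / 1024 ** 2:.2f} MiB of debug sections")
```

Running the same function over the rows of tables b), c) and d) gives directly comparable totals for the other variants.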
>
> I'm still trying to figure out the problems on my end to try running
> your experiment on the game package I used in my presentation, but
> have been interrupted by other unrelated issues. I'll try to get back
> to this in the coming days.
>
> James
>
> On Wed, 4 Nov 2020 at 11:54, Alexey Lapshin <avl.lapshin at gmail.com
> <mailto:avl.lapshin at gmail.com>> wrote:
>
> Hi James,
>
> I did experiments with the clang code base and will do experiments
> with our local codebase later.
> Overall, both solutions ("Fragmented DWARF" and "DWARFLinker
> without odr types deduplication") appear to give similar size
> savings for the final binary. "DWARFLinker with odr types
> deduplication" has a bigger size saving effect. "Fragmented DWARF"
> increases the size of original object files up to 15%.
> LLD with "fragmented DWARF" works significantly faster than with
> "DWARFLinker".
>
> Following are the results for "llvm-strings" and "clang" binaries:
>
> 1. llvm-strings:
>
> source object files size: 381M.
> fragmented source object files size: 451M(18% increase).
>
> a. upstream version,
> command line options: --gc-sections
> binary size: 6,5M
> compilation time: 0:00.13 sec
> run-time memory: 111kb
>
> b. "fragmented DWARF" version,
> command line options: --gc-sections --mark-live-pc=0.45
> binary size: 3,7M
> compilation time: 0:00.10 sec
> run-time memory: 122kb
>
> c. DWARFLinker version,
> command line options: --gc-sections --gc-debuginfo
> binary size: 3,8M
> compilation time: 0:00.33 sec
> run-time memory: 141kb
>
> d. DWARFLinker no-odr version,
> command line options: --gc-sections --gc-debuginfo
> --gc-debuginfo-no-odr
> binary size: 4,3M
> compilation time: 0:00.38 sec
> run-time memory: 142kb
>
>
> 2. clang:
>
> source object files size: 6,5G.
> fragmented source object files size: 7,3G(13% increase).
>
> a. upstream version,
> command line options: --gc-sections
> binary size: 1,5G
> compilation time: 6 sec
> run-time memory: 9.7G
>
> b. "fragmented DWARF" version,
> command line options: --gc-sections --mark-live-pc=0.43
> binary size: 1,1G
> compilation time: 9 sec
> run-time memory: 11G
>
> c. DWARFLinker version,
> command line options: --gc-sections --gc-debuginfo
> binary size: 836M
> compilation time: 62 sec
> run-time memory: 15G
>
> d. DWARFLinker no-odr version,
> command line options: --gc-sections --gc-debuginfo
> --gc-debuginfo-no-odr
> binary size: 1,3G
> compilation time: 128 sec
> run-time memory: 17G
>
> Detailed size results:
>
> 1. llvm-strings
>
> a)
>
> FILE SIZE VM SIZE
> -------------- --------------
> 41.1% 2.64Mi 0.0% 0 .debug_info
> 24.9% 1.60Mi 0.0% 0 .debug_str
> 12.6% 827Ki 0.0% 0 .debug_line
> 6.5% 428Ki 63.8% 428Ki .text
> 4.8% 317Ki 0.0% 0 .strtab
> 3.4% 223Ki 0.0% 0 .debug_ranges
> 2.0% 133Ki 19.8% 133Ki .eh_frame
> 1.7% 110Ki 0.0% 0 .symtab
> 1.2% 77.6Ki 0.0% 0 .debug_abbrev
>
> b)
>
> FILE SIZE VM SIZE
> -------------- --------------
> 50.3% 1.85Mi 0.0% 0 .debug_info
> 43.6% 1.60Mi 0.0% 0 .debug_str
> 2.6% 98.2Ki 0.0% 0 .debug_line
> 2.1% 77.6Ki 0.0% 0 .debug_abbrev
> 0.5% 17.5Ki 54.9% 17.4Ki .text
> 0.3% 9.94Ki 0.0% 0 .strtab
> 0.2% 6.27Ki 0.0% 0 .symtab
> 0.1% 5.09Ki 15.9% 5.03Ki .eh_frame
> 0.1% 3.28Ki 0.0% 0 .debug_ranges
>
> c)
>
> FILE SIZE VM SIZE
> -------------- --------------
> 33.0% 1.25Mi 0.0% 0 .debug_info
> 29.2% 1.11Mi 0.0% 0 .debug_str
> 11.0% 428Ki 63.8% 428Ki .text
> 8.2% 317Ki 0.0% 0 .strtab
> 7.8% 304Ki 0.0% 0 .debug_line
> 3.4% 133Ki 19.8% 133Ki .eh_frame
> 2.8% 110Ki 0.0% 0 .symtab
> 1.7% 65.9Ki 0.0% 0 .debug_ranges
> 1.0% 38.4Ki 5.7% 38.4Ki .rodata
>
> d)
>
> FILE SIZE VM SIZE
> -------------- --------------
> 39.7% 1.68Mi 0.0% 0 .debug_info
> 26.3% 1.11Mi 0.0% 0 .debug_str
> 9.9% 428Ki 63.8% 428Ki .text
> 7.3% 317Ki 0.0% 0 .strtab
> 7.0% 304Ki 0.0% 0 .debug_line
> 3.1% 133Ki 19.8% 133Ki .eh_frame
> 2.6% 110Ki 0.0% 0 .symtab
> 1.5% 65.9Ki 0.0% 0 .debug_ranges
>
>
> 2. clang
>
> a)
>
> FILE SIZE VM SIZE
> -------------- --------------
> 58.3% 878Mi 0.0% 0 .debug_info
> 11.8% 177Mi 0.0% 0 .debug_str
> 7.7% 115Mi 62.2% 115Mi .text
> 7.7% 115Mi 0.0% 0 .debug_line
> 6.0% 90.7Mi 0.0% 0 .strtab
> 2.4% 35.4Mi 0.0% 0 .debug_ranges
> 1.5% 23.3Mi 12.5% 23.3Mi .eh_frame
> 1.5% 23.0Mi 12.4% 23.0Mi .rodata
> 1.2% 17.9Mi 0.0% 0 .symtab
>
> b)
>
> FILE SIZE VM SIZE
> -------------- --------------
> 71.5% 772Mi 0.0% 0 .debug_info
> 16.5% 177Mi 0.0% 0 .debug_str
> 3.7% 40.2Mi 59.2% 40.2Mi .text
> 2.4% 25.8Mi 0.0% 0 .debug_line
> 2.1% 23.0Mi 0.0% 0 .strtab
> 1.0% 10.6Mi 15.6% 10.6Mi .dynstr
> 0.7% 7.18Mi 10.6% 7.18Mi .eh_frame
> 0.5% 5.60Mi 0.0% 0 .symtab
> 0.4% 4.28Mi 0.0% 0 .debug_ranges
> 0.4% 4.04Mi 0.0% 0 .debug_abbrev
>
>
> c)
>
> FILE SIZE VM SIZE
> -------------- --------------
> 35.1% 293Mi 0.0% 0 .debug_info
> 21.2% 177Mi 0.0% 0 .debug_str
> 13.9% 115Mi 62.2% 115Mi .text
> 10.9% 90.7Mi 0.0% 0 .strtab
> 6.9% 57.4Mi 0.0% 0 .debug_line
> 2.8% 23.3Mi 12.5% 23.3Mi .eh_frame
> 2.8% 23.0Mi 12.4% 23.0Mi .rodata
> 2.1% 17.9Mi 0.0% 0 .symtab
> 1.5% 12.4Mi 0.0% 0 .debug_ranges
> 1.3% 10.6Mi 5.7% 10.6Mi .dynstr
>
> d)
>
> FILE SIZE VM SIZE
> -------------- --------------
> 58.3% 758Mi 0.0% 0 .debug_info
> 13.6% 177Mi 0.0% 0 .debug_str
> 8.9% 115Mi 62.2% 115Mi .text
> 7.0% 90.7Mi 0.0% 0 .strtab
> 4.4% 57.4Mi 0.0% 0 .debug_line
> 1.8% 23.3Mi 12.5% 23.3Mi .eh_frame
> 1.8% 23.0Mi 12.4% 23.0Mi .rodata
> 1.4% 17.9Mi 0.0% 0 .symtab
> 1.0% 12.4Mi 0.0% 0 .debug_ranges
> 0.8% 10.6Mi 5.7% 10.6Mi .dynstr
>
> Thank you, Alexey.
>
> On 19.10.2020 11:50, James Henderson wrote:
>> Great, thanks Alexey! I'll try to take a look at this in the near
>> future, and will report my results back here. I imagine our clang
>> results will differ, purely because we probably used different
>> toolchains to build the input in the first place.
>>
>> On Thu, 15 Oct 2020 at 10:08, Alexey Lapshin
>> <avl.lapshin at gmail.com <mailto:avl.lapshin at gmail.com>> wrote:
>>
>>
>> On 13.10.2020 10:20, James Henderson wrote:
>>> The script included in the patch can be used to convert an
>>> object containing normal DWARF into an object using
>>> fragmented DWARF. It does this by using llvm-dwarfdump to
>>> dump the various sections, parses the output to identify
>>> where it should split (using the offsets of the various
>>> entries), and then writes new section headers accordingly -
>>> you can see roughly what it's doing if you get a chance to
>>> watch the talk recording. The additional section headers are
>>> appended to the end of the ELF section header table, whilst
>>> the original DWARF is left in the same place it was before
>>> (making use of the fact that section headers don't have to
>>> appear in offset order). The script also parses and
>>> fragments the relocation sections targeting the DWARF
>>> sections so that they match up with the fragmented DWARF
>>> sections. This is clearly all suboptimal - in practice the
>>> compiler should be modified to do the fragmenting upfront,
>>> to save having to parse a tool's stdout, but that was just
>>> the simplest thing I could come up with to quickly write the
>>> script. Full details of the script usage are included in the
>>> patch description, if you want to play around with it.
>>>
>>> If Alexey could point me at the latest version of his patch,
>>> I'd be happy to run that through either or both of the
>>> packages I used to see what happens. Equally, I'd be happy
>>> if Alexey is able to run my script to fragment and measure
>>> the performance of a couple of projects he's been working
>>> with. Based purely on the two packages I've tried this with,
>>> I can tell already that the results can vary wildly. My
>>> expectation is that Alexey's approach will be slower (at
>>> least in its current form, but probably more generally), but
>>> produce smaller output, but to what scale I have no idea.
>>
>> James, I updated the patch - https://reviews.llvm.org/D74169.
>>
>> To make it work, it is necessary to build the example with
>> -ffunction-sections and pass the following options to the linker:
>>
>> --gc-sections --gc-debuginfo --gc-debuginfo-no-odr
>>
>> For the clang binary I got the following results:
>>
>> 1. --gc-sections = binary size 1,5G, Debug Info size(*) 1.2G
>>
>> 2. --gc-sections --gc-debuginfo = binary size 840M, 8x
>> performance decrease, Debug Info size 542M
>>
>> 3. --gc-sections --gc-debuginfo --gc-debuginfo-no-odr =
>> binary size 1,3G, 16x performance decrease, Debug Info size 1G
>>
>> (*) .debug_info+.debug_str+.debug_line+.debug_ranges+.debug_loc
>>
>>
>> I added the option --gc-debuginfo-no-odr so that the size reduction
>> could be compared fairly. Without that option D74169 does
>> type deduplication, and then it is not correct to compare the
>> resulting size with the "Fragmented DWARF" solution, which does
>> not do type deduplication.
>>
>> Also, I am looking at your D89229 <https://reviews.llvm.org/D89229>
>> and will share results some time later.
>>
>> Thank you, Alexey.
>>
>>>
>>> I think linkers parse .eh_frame partly because they have no
>>> other choice. That being said, I think its format is not
>>> too complex, so similarly the parser isn't too complex. You
>>> can see LLD's ELF implementation in ELF/EhFrame.cpp, how it
>>> is used in ELF/InputSection.cpp (see the bits to do with
>>> EhInputSection) and EhFrameSection in
>>> ELF/SyntheticSections.h (plus various usages of these two
>>> throughout the LLD code). I think the key to any structural
>>> changes in the DWARF format to make them more amenable to
>>> link-time parsing is being able to read a minimal amount
>>> without needing to parse the payload (e.g. a length field,
>>> some sort of type, and then using the relocations to
>>> associate it accordingly).
>>>
>>> James
>>>
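
The point above about reading only a length field can be illustrated with .eh_frame-style records. The sketch below is not LLD's code. It assumes little-endian data and the usual record layout in which each CIE/FDE starts with a 4-byte length, where 0xffffffff signals that an 8-byte length follows and a zero length terminates the list. The function name `split_records` is invented for the example.

```python
import struct


def split_records(data):
    """Split an .eh_frame-style buffer into (offset, size) records.

    Only the length field of each record is read; the payload is skipped.
    """
    records = []
    offset = 0
    while offset + 4 <= len(data):
        (length,) = struct.unpack_from("<I", data, offset)
        if length == 0:  # zero-length terminator entry
            break
        header = 4
        if length == 0xFFFFFFFF:  # extended 64-bit length follows
            if offset + 12 > len(data):
                raise ValueError("truncated extended length")
            (length,) = struct.unpack_from("<Q", data, offset + 4)
            header = 12
        size = header + length
        if offset + size > len(data):
            raise ValueError("record runs past end of section")
        records.append((offset, size))
        offset += size
    return records


# Two fake records with 8- and 4-byte payloads, followed by a terminator.
section = (struct.pack("<I", 8) + b"\x00" * 8 +
           struct.pack("<I", 4) + b"\x00" * 4 +
           struct.pack("<I", 0))
print(split_records(section))  # [(0, 12), (12, 8)]
```

Each (offset, size) pair could then be associated with a function via the relocations that target it, without decoding the record body.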
>>> On Mon, 12 Oct 2020 at 20:48, David Blaikie
>>> <dblaikie at gmail.com <mailto:dblaikie at gmail.com>> wrote:
>>>
>>> Awesome! Sorry I missed the lightning talk, but really
>>> interested to see this sort of thing (though it's not
>>> directly/immediately applicable to the use case I work
>>> with - Split DWARF, something similar could be used
>>> there with further work)
>>>
>>> Though it looks like the patch has mostly linker changes
>>> - where/how do you generate the fragmented DWARF to
>>> begin with? Via the Python script? Run over assembly?
>>> I'd be surprised if it was achievable that way - curious
>>> to know more.
>>>
>>> Got a rough sense/are you able to run apples-to-apples
>>> comparisons with Alexey's linker-based patches to
>>> compare linker time/memory overhead versus resulting
>>> output size gains?
>>>
>>> (& yeah, I'm a bit curious about how the linkers do
>>> eh_frame rewriting, if the format is especially amenable
>>> to a lightweight parsing/rewriting and how we could make
>>> the DWARF more amenable to that too)
>>>
>>> On Mon, Oct 12, 2020 at 6:41 AM James Henderson
>>> <jh7370.2008 at my.bristol.ac.uk
>>> <mailto:jh7370.2008 at my.bristol.ac.uk>> wrote:
>>>
>>> Hi all,
>>>
>>> At the recent LLVM developers' meeting, I presented
>>> a lightning talk on an approach to reduce the amount
>>> of dead debug data left in an executable following
>>> operations such as --gc-sections and duplicate
>>> COMDAT removal. In that presentation, I presented
>>> some figures based on linking a game that had been
>>> built by our downstream clang port and fragmented
>>> using the described approach. Since recording the
>>> presentation, I ran the same experiment on a clang
>>> package (this time built with a GCC version). The
>>> comparable figures are below:
>>>
>>> Link-time speed (s):
>>> +--------------------+-------+---------------+------+------+------+------+------+
>>> | Package variant    | No GC | GC 1 (normal) | GC 2 | GC 3 | GC 4 | GC 5 | GC 6 |
>>> +--------------------+-------+---------------+------+------+------+------+------+
>>> | Game (plain)       | 4.5   | 4.9           | 4.2  | 3.6  | 3.4  | 3.3  | 3.2  |
>>> | Game (fragmented)  | 11.1  | 11.8          | 9.7  | 8.6  | 7.9  | 7.7  | 7.5  |
>>> | Clang (plain)      | 13.9  | 17.9          | 17.0 | 16.7 | 16.3 | 16.2 | 16.1 |
>>> | Clang (fragmented) | 18.6  | 22.8          | 21.6 | 21.1 | 20.8 | 20.5 | 20.2 |
>>> +--------------------+-------+---------------+------+------+------+------+------+
>>>
>>> Output size - Game package (MB):
>>> +---------------------+-------+------+------+------+------+------+------+
>>> | Category            | No GC | GC 1 | GC 2 | GC 3 | GC 4 | GC 5 | GC 6 |
>>> +---------------------+-------+------+------+------+------+------+------+
>>> | Plain (total)       | 1149  | 1121 | 1017 | 965  | 938  | 930  | 928  |
>>> | Plain (DWARF*)      | 845   | 845  | 845  | 845  | 845  | 845  | 845  |
>>> | Plain (other)       | 304   | 276  | 172  | 120  | 93   | 85   | 82   |
>>> | Fragmented (total)  | 1044  | 940  | 556  | 373  | 287  | 263  | 255  |
>>> | Fragmented (DWARF*) | 740   | 664  | 384  | 253  | 194  | 178  | 173  |
>>> | Fragmented (other)  | 304   | 276  | 172  | 120  | 93   | 85   | 82   |
>>> +---------------------+-------+------+------+------+------+------+------+
>>>
>>>
>>> Output size - Clang (MB):
>>> +---------------------+-------+------+------+------+------+------+------+
>>> | Category            | No GC | GC 1 | GC 2 | GC 3 | GC 4 | GC 5 | GC 6 |
>>> +---------------------+-------+------+------+------+------+------+------+
>>> | Plain (total)       | 2596  | 2546 | 2406 | 2332 | 2293 | 2273 | 2251 |
>>> | Plain (DWARF*)      | 1979  | 1979 | 1979 | 1979 | 1979 | 1979 | 1979 |
>>> | Plain (other)       | 616   | 567  | 426  | 353  | 314  | 294  | 272  |
>>> | Fragmented (total)  | 2397  | 2346 | 2164 | 2069 | 2017 | 1990 | 1963 |
>>> | Fragmented (DWARF*) | 1780  | 1780 | 1738 | 1716 | 1703 | 1696 | 1691 |
>>> | Fragmented (other)  | 616   | 567  | 426  | 353  | 314  | 294  | 272  |
>>> +---------------------+-------+------+------+------+------+------+------+
>>>
>>> *DWARF size == total size of .debug_info +
>>> .debug_line + .debug_ranges + .debug_aranges +
>>> .debug_loc
>>>
>>> Additionally, I have posted
>>> https://reviews.llvm.org/D89229 which provides the
>>> python script and linker patches used to reproduce
>>> the above results on my machine. The GC 1/2/3/4/5/6
>>> correspond to the linker option added in that patch
>>> --mark-live-pc with values 1/0.8/0.6/0.4/0.2/0
>>> respectively.
>>>
>>> During the conference, the question was asked what
>>> the memory usage and input size impact was. I've
>>> summarised these below:
>>>
>>> Input file size total (GB):
>>> +--------------------+------------+
>>> | Package variant | Total Size |
>>> +--------------------+------------+
>>> | Game (plain) | 2.9 |
>>> | Game (fragmented) | 4.2 |
>>> | Clang (plain) | 10.9 |
>>> | Clang (fragmented) | 12.3 |
>>> +--------------------+------------+
>>>
>>> Peak Working Set Memory usage (GB):
>>> +--------------------+-------+------+
>>> | Package variant | No GC | GC 1 |
>>> +--------------------+-------+------+
>>> | Game (plain) | 4.3 | 4.7 |
>>> | Game (fragmented) | 8.9 | 8.6 |
>>> | Clang (plain) | 15.7 | 15.6 |
>>> | Clang (fragmented) | 19.4 | 19.2 |
>>> +--------------------+-------+------+
>>>
>>> I'm keen to hear what people's feedback is, and also
>>> interested to see what results others might see by
>>> running this experiment on other input packages.
>>> Also, if anybody has any alternative ideas that meet
>>> the goals listed below, I'd love to hear them!
>>>
>>> To reiterate some key goals of fragmented DWARF,
>>> similar to what I said in the presentation:
>>> 1) Devise a scheme that gives significant size
>>> savings without being too costly. It's clear from
>>> just the two packages I've tried this on that there
>>> is a fairly hefty link time performance cost,
>>> although the exact cost depends on the nature of the
>>> input package. On the other hand, depending on the
>>> nature of the input package, there can also be some
>>> big gains.
>>> 2) Devise a scheme that doesn't require any linker
>>> knowledge of DWARF. The current approach doesn't
>>> quite achieve this properly due to the slight misuse
>>> of SHF_LINK_ORDER, but I expect that a pivot to
>>> using non-COMDAT group sections should solve this
>>> problem.
>>> 3) Provide some kind of halfway house between simply
>>> writing tombstone values into dead DWARF and fully
>>> parsing the DWARF to reoptimise it/discard the dead
>>> bits.
>>>
>>> I'm hopeful that changes could be made to the linker
>>> to improve the link-time cost. There seems to be a
>>> significant amount of the link time spent creating
>>> the input sections. An alternative would be to
>>> devise a scheme that would avoid the literal
>>> splitting into section headers, in favour of some
>>> sort of list of split-points that the linker uses to
>>> split things up (a bit like it already does for
>>> .eh_frame or mergeable sections).
>>>
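
The split-point idea in the last quoted paragraph could look roughly like the following. This is a hypothetical sketch, not part of D89229: the `split_at` and `keep_live` helpers, and the notion of passing liveness as a list of booleans, are assumptions made purely for illustration.

```python
def split_at(data, split_points):
    """Cut data into fragments at the given sorted offsets."""
    bounds = [0] + list(split_points) + [len(data)]
    if bounds != sorted(bounds):
        raise ValueError("split points must be sorted and within the section")
    return [data[start:end] for start, end in zip(bounds, bounds[1:])]


def keep_live(fragments, live):
    """Concatenate only the fragments marked live, preserving their order."""
    return b"".join(frag for frag, is_live in zip(fragments, live) if is_live)


# A 12-byte stand-in for a debug section, split into three 4-byte fragments.
section = b"AAAABBBBCCCC"
fragments = split_at(section, [4, 8])
# Pretend the middle fragment described a function that was garbage collected.
print(keep_live(fragments, [True, False, True]))  # b'AAAACCCC'
```

The section stays one contiguous blob in the input file, and only the list of offsets has to be carried alongside it, which is the property that would avoid creating one section header per fragment.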