[llvm-commits] patch: pick direct or indirect strings in DWARF

Nick Lewycky nlewycky at google.com
Wed Oct 26 17:37:23 PDT 2011


New patch! Updates tests and adds new test. Tested on Linux and Darwin.

The previous patch appeared to work on Linux but did not: readelf -w on the
.o file looked fine, but the string references had no relocations, so the
debug info in the linked result was wrong. Darwin must not have relocations
for its string pool entries, while Linux must have them.

Please review!

I have not tested this for size improvements in .o files, but I expect it to
help there too. If you have a function that is overloaded or templated, its
DW_AT_name is repeated for each copy. This patch interns those copies. For
strings that occur only once, we pay an overhead of 4 bytes (unless the
string fits in 4 bytes, in which case we emit it directly).
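
To make the trade-off above concrete, here is a minimal sketch of the size
accounting. It is not the code from the patch: every name in it (StringForm,
chooseForm, StringPool, measure) is hypothetical, and it assumes 32-bit DWARF
(a 4-byte DW_FORM_strp reference) and that "fits in 4 bytes" counts the NUL
terminator.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical names throughout; this is an illustration, not the patch.
enum class StringForm { Direct, Indirect }; // DW_FORM_string vs DW_FORM_strp

// Assumption: 32-bit DWARF, so a DW_FORM_strp reference occupies 4 bytes.
constexpr std::size_t kStrpSize = 4;

// A direct string costs its characters plus the NUL terminator.
inline std::size_t directSize(const std::string &S) { return S.size() + 1; }

// Emit directly only when the direct encoding is no larger than a reference.
inline StringForm chooseForm(const std::string &S) {
  return directSize(S) <= kStrpSize ? StringForm::Direct
                                    : StringForm::Indirect;
}

// Toy stand-in for .debug_str: each distinct string is stored once.
class StringPool {
  std::map<std::string, std::uint32_t> Offsets;
  std::uint32_t NextOffset = 0;

public:
  // Returns the pool offset of S, adding S on first use.
  std::uint32_t intern(const std::string &S) {
    auto It = Offsets.find(S);
    if (It != Offsets.end())
      return It->second;
    std::uint32_t Off = NextOffset;
    Offsets.emplace(S, Off);
    NextOffset += static_cast<std::uint32_t>(directSize(S));
    return Off;
  }
  std::size_t size() const { return NextOffset; }
};

struct Sizes {
  std::size_t AllDirect; // every name emitted inline
  std::size_t Mixed;     // short names inline, the rest interned
};

// Total bytes spent on the given names under each strategy.
inline Sizes measure(const std::vector<std::string> &Names) {
  StringPool Pool;
  Sizes R{0, 0};
  for (const std::string &N : Names) {
    R.AllDirect += directSize(N);
    if (chooseForm(N) == StringForm::Direct) {
      R.Mixed += directSize(N);
    } else {
      Pool.intern(N);
      R.Mixed += kStrpSize;
    }
  }
  R.Mixed += Pool.size(); // pool bytes are paid once per distinct string
  return R;
}
```

Under these assumptions, a name that occurs once costs 4 bytes more when
interned, while three copies of a 9-character name cost 22 bytes instead of
30.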

I have however looked at the size reduction of a few small real-world C++
programs at -g -O0 on Linux.

Before this patch:
Clang: 804,833,856 bytes
Program 1: 1,732,799,096 bytes
Program 2: 2,277,486,312 bytes
Program 3: 1,509,342,448 bytes

After this patch:
Clang: 417,182,256 bytes
Program 1: 935,406,928 bytes
Program 2: 1,239,206,568 bytes
Program 3: 825,010,104 bytes

So in my small sample, binaries built with this patch are roughly 52-55% of
their original size. Before we get too excited, I should also show gcc's
numbers:

Clang: 286,878,920 bytes
Program 1: 694,820,176 bytes
Program 2: 909,171,592 bytes
Program 3: 613,554,952 bytes

which suggests that we have further room to improve.
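
For anyone who wants to re-derive the ratios, a small sketch follows. The
byte counts are the ones listed above; the struct and function names are
hypothetical and exist only for this illustration.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical record type; the figures are copied from this message.
struct Measurement {
  std::string Name;
  std::uint64_t Before; // clang without the patch
  std::uint64_t After;  // clang with the patch
  std::uint64_t Gcc;    // gcc
};

// Part expressed as a percentage of Whole.
inline double percentOf(std::uint64_t Part, std::uint64_t Whole) {
  return 100.0 * static_cast<double>(Part) / static_cast<double>(Whole);
}

inline std::vector<Measurement> reportedSizes() {
  return {
      {"Clang", 804833856ULL, 417182256ULL, 286878920ULL},
      {"Program 1", 1732799096ULL, 935406928ULL, 694820176ULL},
      {"Program 2", 2277486312ULL, 1239206568ULL, 909171592ULL},
      {"Program 3", 1509342448ULL, 825010104ULL, 613554952ULL},
  };
}
```

Calling percentOf(M.After, M.Before) for each entry gives the after/before
ratio, and percentOf(M.Gcc, M.After) shows how far the patched output still
is from gcc's.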

Nick

On 25 October 2011 01:11, Nick Lewycky <nicholas at mxc.ca> wrote:

> Updated patch, this one works on Darwin now too.
>
> I have tested this patch on Darwin in both -m32 and -m64 modes. The
> difference from the previous patch is that I now use DIEDelta to compute a
> section offset against the string pool, instead of DIELabel to refer to the
> string directly (hey, it worked for me on Linux).
>
> Please review!
>
> Nick
>
> Nick Lewycky wrote:
>
>> DWARF allows strings to be specified in one of two ways, either by
>> writing them literally (DW_FORM_string) or by including a pointer into
>> the .debug_str section and putting the NUL-terminated string there.
>>
>> The attached patch removes the Form argument from CompileUnit::AddString
>> and changes addString to emit either a DIEString or a DIELabel depending
>> on how long the string is. If the string would fit in 4 bytes in the
>> direct encoding, do that. Otherwise, hoist it out into .debug_str so
>> that it can be interned.
>>
>> This is a major issue on Linux, where the linker does not turn direct
>> strings into indirect strings, but does merge the string tables. On
>> Darwin, the linker will turn direct strings into indirect strings as
>> needed. However, this change should probably be enabled on all platforms
>> as it generally makes .o files smaller.
>>
>> Please review! The one thing I don't like about this patch is that we
>> emit the bytes (via .ascii) and then emit the NUL (via ".zero 1"). The
>> alternatives I see are either to create a copy of the string, or to
>> create a new emitBytesWithNUL API in MCStreamer.
>>
>> Nick
>>
>> PS. If this patch doesn't work on Darwin out-of-the-box, please try one
>> thing for me: comment out the change to DIELabel::SizeOf (adding a case
>> for DW_FORM_strp), and let me know whether that fixes things.
>>
>>
>>
>>
>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: dwarf-indirect-string-4.patch
Type: text/x-patch
Size: 22829 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20111026/8b259143/attachment.bin>

