[llvm-dev] Skipping names of temporary symbols increased size of ARM binaries.

Duncan P. N. Exon Smith via llvm-dev llvm-dev at lists.llvm.org
Fri Sep 18 16:16:33 PDT 2015


+rafael and pete, who worked on this with me, and a couple of debug info folks.

> On 2015-Sep-18, at 09:10, Oleg Ranevskyy <llvm.mail.list at gmail.com> wrote:
> 
> CC llvm-dev
> 
> ---------- Forwarded message ----------
> 
> Hello Duncan
> 
> The size of ARM binaries created by clang has increased after r236642.
> Would you be able to find some time to look at my findings and share your thoughts about the problem, please?
> 
> r236642 stops emitting temporary label names into object files in order to save memory. That is fine in itself, since the label names do not appear in the resulting binaries. However, it creates a problem for the binutils linker, which inspects the symbol names in its input object files and may decide to skip some local compiler-generated labels. It no longer sees the label names and therefore puts all of these symbols into the final binary. I will demonstrate this with an example.
> 
> If we compile the attached main.cpp file for ARM
>     clang++ -c -o main.o -O0 -g --target=armv7l-linux-gnueabihf main.cpp
> and then look at the symbols
>     readelf -s main.o
> there will be a number of similar entries (showing one entry only here for conciseness):
>     Num:    Value  Size Type    Bind   Vis      Ndx Name
>       7: 00000062     0 NOTYPE  LOCAL  DEFAULT    9
> These are the .Linfo_string<index> symbols whose names are skipped due to r236642.
> 
> If we now link it
>     clang++ -o main.out --target=armv7l-linux-gnueabihf main.o
> all the symbols get through to the final binary:
>     readelf -s main.out
> 
>     Num:    Value  Size Type    Bind   Vis      Ndx Name
>      73: 0000006e     0 NOTYPE  LOCAL  DEFAULT   32
> 
> Without the names, the linker cannot tell whether these labels are local ones that should be left out. Its bfd_is_local_label_name function returns false, so the labels are not skipped.
> 
> Before r236642 the names were inserted into object files:
>     readelf -s main.o
> 
>     Num:    Value  Size Type    Bind   Vis      Ndx Name
>      23: 0000007a     0 NOTYPE  LOCAL  DEFAULT    9 .Linfo_string7
> 
> bfd_is_local_label_name returns true if the name starts with ".L", so the symbol is skipped.
> 
> This is not critical for small projects but can create noticeable overhead for the big ones.

What's the linker really doing here?  Is this some form of GC, or is it trying to strip out debug info, or...?

I'm surprised it won't drop symbols that have no names if it's dropping local symbols.  Could this be an oversight in the linker?
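
For reference, the name-based check described in the quoted message can be sketched roughly as follows. This is only an illustration in Python, not the binutils code: is_local_label_name and symbols_kept are hypothetical stand-ins, and the only rule modeled is the ".L" prefix test mentioned above (the real bfd_is_local_label_name may well accept other patterns).

```python
def is_local_label_name(name: str) -> bool:
    # Hypothetical stand-in for the linker's check: only the ".L" prefix
    # rule from the quoted message is modeled here.
    return name.startswith(".L")


def symbols_kept(names):
    # Keep every symbol whose name is not recognized as a local label.
    return [n for n in names if not is_local_label_name(n)]


if __name__ == "__main__":
    # An empty name can never match the ".L" prefix, so it is kept.
    print(symbols_kept([".Linfo_string7", "", "main"]))
```

Under that assumption, a symbol whose name has been dropped is indistinguishable from any other non-".L" symbol, which would explain why it survives into the final binary.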


