[llvm-dev] Fwd: Skipping names of temporary symbols increased size of ARM binaries.

Oleg Ranevskyy via llvm-dev llvm-dev at lists.llvm.org
Fri Sep 18 09:10:14 PDT 2015


CC llvm-dev

---------- Forwarded message ----------

Hello Duncan

The size of ARM binaries created by clang has increased after r236642.
Would you be able to find some time to look at my findings and share your
thoughts about the problem, please?

r236642 stops emitting the names of temporary labels into object files to
save memory. This is fine in itself: the label names do not appear in the
resulting binaries. However, it creates a problem for the binutils linker,
which inspects the symbol names in its input object files to decide which
local compiler-generated labels to skip. The linker no longer sees the
label names and therefore puts all of these symbols into the final binary.
I will demonstrate this with an example.

If we compile the attached main.cpp file for ARM
    clang++ -c -o main.o -O0 -g --target=armv7l-linux-gnueabihf main.cpp
and then look at the symbols
    readelf -s main.o
there will be a number of similar entries (showing one entry only here for
conciseness):
    Num:    Value  Size Type    Bind   Vis      Ndx Name
      7: 00000062     0 NOTYPE  LOCAL  DEFAULT    9
These are the .Linfo_string<index> symbols, whose names are omitted as of
r236642.

If we now link it
    clang++ -o main.out --target=armv7l-linux-gnueabihf main.o
all the symbols get through to the final binary:
    readelf -s main.out

    Num:    Value  Size Type    Bind   Vis      Ndx Name
     73: 0000006e     0 NOTYPE  LOCAL  DEFAULT   32

Without the names, the linker cannot tell that these are local labels that
should be left out. Its bfd_is_local_label_name function returns false, so
the labels are not skipped.

Before r236642 the names were inserted into object files:
    readelf -s main.o

    Num:    Value  Size Type    Bind   Vis      Ndx Name
     23: 0000007a     0 NOTYPE  LOCAL  DEFAULT    9 .Linfo_string7

bfd_is_local_label_name returns true if the name starts with ".L", and the
symbol is then skipped.
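
To make the effect concrete, here is a minimal sketch of the prefix rule as
described above. It is not the binutils implementation, which may handle
more cases; the Symbol struct and both function names are hypothetical and
exist only for illustration.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a symbol table entry; not a binutils type.
struct Symbol {
    std::string name;  // empty when the compiler omitted the name
    bool is_local;     // true for symbols with LOCAL binding
};

// Simplified form of the rule described in this message: only names
// that begin with ".L" are treated as compiler-generated local labels.
bool is_local_label_name(const std::string& name) {
    return name.size() >= 2 && name[0] == '.' && name[1] == 'L';
}

// Count the local symbols that a linker applying this rule would keep.
std::size_t count_kept_locals(const std::vector<Symbol>& symbols) {
    std::size_t kept = 0;
    for (const Symbol& sym : symbols) {
        if (sym.is_local && !is_local_label_name(sym.name)) {
            ++kept;
        }
    }
    return kept;
}
```

Under this sketch, a local symbol named ".Linfo_string7" is dropped, while
the same symbol with an empty name fails the prefix test and is kept, which
matches the behaviour shown in the readelf output above.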

This is not critical for small projects, but it can add noticeable size
overhead to large ones.

Any help will be much appreciated.
Thank you.
Oleg
-------------- next part --------------
int get_index(int max_value)
{
    static unsigned long x=123456789;

    x ^= x << 16;

    return x % max_value;
}

int main()
{
  int values[10] = {7};
  values[get_index(10)] = 5;

  return 0;
}

