[llvm-dev] [ELF] String literals don't obey -fdata-sections

Fangrui Song via llvm-dev llvm-dev at lists.llvm.org
Tue Sep 15 15:18:56 PDT 2020


On 2020-09-15, Gaël Jobin via llvm-dev wrote:
>Hi there,
>
>When I compile my code with -fdata-sections and -ffunction-sections, I
>still see some unused strings in my shared library (Android). The
>strings appear together inside a .rodata.str1.1 section instead of
>each getting its own section. It seems that C-string literals are
>treated differently from other constants and that -fdata-sections is
>not respected in
>https://github.com/llvm/llvm-project/blob/master/llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp#L799
>[1]. I came across the following GCC bug
>https://gcc.gnu.org/bugzilla/show_bug.cgi?id=192 where they fixed
>the issue back in 2015. Any reason not to do so in LLVM?

Usually it is because nobody has noticed the problem or nobody is
motivated enough to fix it, not because the problem is intentionally
left open :) I took some time to look at this and concluded
that clang should do nothing here. In fact, with the current clang behavior,
you can discard "Unused" if you use LLD. Read on.

>My code example:
>- static library 1 : expose functions api1() and api3()
>
>>#include "lib1.h"
>>
>>static char *test = "Test";
>>static char *unused = "Unused";
>>
>>void api1(){
>>printf(test);
>>}
>>
>>void api3(){
>>printf(unused);
>>}
>
>- shared library : use only function api1() from static library 1
>
>>#include "lib1.h"
>>
>>void test(){
>>api1();
>>}
>
>Both compiled with "-fdata-sections -ffunction-sections
>-fvisibility=hidden" and linked with "--gc-sections".
>
>While the api3() function is correctly gone, the result for the C-string
>is the following (in Hopper):
>
>>; Section .rodata.str1.1
>>; Range: [0x63; 0x6f[ (12 bytes)
>>; File offset : [151; 163[ (12 bytes)
>>; Flags: 0x32
>>;   SHT_PROGBITS
>>;   SHF_ALLOC
>>.L.str:
>>00000063         db         "Test", 0
>>.L.str.1:
>>00000068         db         "Unused", 0
>
>
>
>Links:
>------
>[1]
>https://github.com/llvm/llvm-project/blob/master/llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp#L799

In GCC, -O turns on -fmerge-constants. Clang does not implement this
option, but it does implement the level 2 variant, -fmerge-all-constants,
which is non-conforming. As the GCC documentation puts it: "Languages like
C or C++ require each variable, including multiple instances of the same
variable in recursive calls, to have distinct locations, so using this
option results in non-conforming behavior."
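To make the conformance point concrete, here is a minimal sketch of the guarantee that the quoted passage says -fmerge-all-constants gives up. The function and variable names are made up for illustration. Built without that option, both functions return true; built with it, a compiler is allowed to fold the identical constants into one object, so the results are no longer guaranteed.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Two distinct objects that happen to hold identical bytes.
   (Hypothetical names, chosen only for this illustration.) */
static const int first[]  = {1, 2, 3};
static const int second[] = {1, 2, 3};

/* A conforming implementation must give the two arrays different
   addresses, so this returns true. */
bool constants_have_distinct_addresses(void) {
    return (uintptr_t)first != (uintptr_t)second;
}

/* Each live activation of this function owns its own `local` object.
   The outer call passes its array down; the inner call compares it
   with its own array. Conforming behavior: the addresses differ. */
bool recursive_instances_are_distinct(const int *outer) {
    const int local[] = {1, 2, 3};
    if (outer == NULL)
        return recursive_instances_are_distinct(local);
    return outer != local;
}
```

Call recursive_instances_are_distinct(NULL) to start the recursion. String literals are different: the language does not require identical literals to be distinct objects, which is why merging them is fine at the plain -fmerge-constants level.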

With (-fmerge-constants or -fmerge-all-constants) and -fdata-sections, GCC
places string literals in .rodata.xxx.str1.1 sections:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=192#c16
This is, however, suboptimal because the per-section cost, a section header
(sizeof(Elf64_Shdr)=64) plus a section name (".rodata.xxx.str1.1"), is quite
large compared with a short string.
I have replied on https://gcc.gnu.org/bugzilla/show_bug.cgi?id=192#c19 and
created a GNU ld feature request
(https://sourceware.org/bugzilla/show_bug.cgi?id=26622).
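A rough sketch of that arithmetic, for anyone who wants to plug in their own section names. Shdr64 is a hand-written mirror of Elf64_Shdr (declared here so the snippet does not depend on <elf.h>), and section_overhead is a hypothetical helper, not an existing API. It treats the name as stored in full in the section name string table, so the result is an upper bound if the producer shares name suffixes.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Field-for-field mirror of Elf64_Shdr, declared locally so the
   example builds on systems without <elf.h>. */
typedef struct {
    uint32_t sh_name;
    uint32_t sh_type;
    uint64_t sh_flags;
    uint64_t sh_addr;
    uint64_t sh_offset;
    uint64_t sh_size;
    uint32_t sh_link;
    uint32_t sh_info;
    uint64_t sh_addralign;
    uint64_t sh_entsize;
} Shdr64;

/* Bytes spent on one extra section in a relocatable object:
   one section header plus the NUL-terminated section name.
   (Hypothetical helper; assumes the name is not shared.) */
size_t section_overhead(const char *section_name) {
    return sizeof(Shdr64) + strlen(section_name) + 1;
}
```

For example, section_overhead(".rodata.xxx.str1.1") is 64 + 19 = 83 bytes, while the "Unused" literal it would isolate is only 7 bytes including its terminator.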

