[llvm-dev] [ELF] String literals don't obey -fdata-sections

Gaël Jobin via llvm-dev llvm-dev at lists.llvm.org
Wed Sep 16 06:14:06 PDT 2020


On 2020-09-16 00:18, Fangrui Song wrote:

> Usually it is because nobody has noticed the problem or nobody is
> motivated enough to fix the problems, not that they intentionally leave
> a problem open:) I took some time to look at the problem and conclude
> that clang should do nothing on this. Actually, with the clang behavior,
> you can discard "Unused" if you use LLD. Read on.

Sorry if I misspoke; I was not suggesting that the bug was known and
deliberately left unfixed out of laziness ;-). I was sure there was a valid
reason and wanted to know what it was. As you explained, it appears that
LLVM relies on LLD to do this instead of enforcing it in the middle-end,
which is a different approach from GCC's.

> In GCC, -O turns on -fmerge-constants. Clang does not implement this
> option, but implements the level 2 -fmerge-all-constants, which is
> non-conforming ("Languages like C or C++ require each variable, including
> multiple instances of the same variable in recursive calls, to have
> distinct locations, so using this option results in non-conforming
> behavior.").

Non-conforming in the sense of the C/C++ standard? How is that related to
the -fdata-sections implementation?

> With (-fmerge-constants or -fmerge-all-constants) & -fdata-sections, string literals are placed in .rodata.xxx.str1.1
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=192#c16
> This is, however, suboptimal because the cost of a section header
> (sizeof(Elf64_Shdr)=64) + a section name (".rodata.xxx.str1.1") is quite large.
> I have replied on https://gcc.gnu.org/bugzilla/show_bug.cgi?id=192#c19 and
> created a GNU ld feature request
> (https://sourceware.org/bugzilla/show_bug.cgi?id=26622)
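
To put a number on the cost quoted above, here is a minimal sketch, not taken from either toolchain. It only adds the section header size given in the quote (64 bytes) to the NUL-terminated section name. The function name section_overhead and the constant ELF64_SHDR_SIZE are made up for illustration, and the estimate ignores alignment padding and any suffix sharing in the section name string table.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption taken from the quoted text: sizeof(Elf64_Shdr) == 64. */
enum { ELF64_SHDR_SIZE = 64 };

/* Hypothetical helper: rough number of bytes one extra section adds to an
 * ELF64 object file, counting one section header table entry plus the
 * NUL-terminated section name. Alignment padding and name sharing in the
 * string table are ignored. */
size_t section_overhead(const char *name)
{
    assert(name != NULL);
    return ELF64_SHDR_SIZE + strlen(name) + 1;
}
```

Under these assumptions, section_overhead(".rodata.xxx.str1.1") comes to 83 bytes, which is large next to a short string literal.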

In my example, LLVM/Clang already puts both pointers, "test" and "unused",
in different data sections because of "-fdata-sections", as seen below.

> ; Segment unnamed segment
> ; Range: [0x5c; 0x64[ (8 bytes)
> ; File offset : [144; 152[ (8 bytes)
> ; Permissions:  - 
> 
> ; Section .data.test
> ; Range: [0x5c; 0x60[ (4 bytes)
> ; File offset : [144; 148[ (4 bytes)
> ; Flags: 0x3
> ;   SHT_PROGBITS
> ;   SHF_WRITE
> ;   SHF_ALLOC 
> 
> test: 
> 
> 0000005c         dd         0x00000063 
> 
> ; Section .data.unused
> ; Range: [0x60; 0x64[ (4 bytes)
> ; File offset : [148; 152[ (4 bytes)
> ; Flags: 0x3
> ;   SHT_PROGBITS
> ;   SHF_WRITE
> ;   SHF_ALLOC 
> 
> unused: 
> 
> 00000060         dd         0x00000070

So I am not sure I understand the point about sub-optimality here, since
the same already applies to the .data section, where each variable incurs
the cost of its own section header. How is C-string data any
different? I mean, the concept of -fdata-sections/-ffunction-sections
("one section for each data item/function") should be the same for every
kind of data, no?
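
For readers without the earlier message, here is a hypothetical reconstruction of the kind of test case discussed in this thread. The symbol names test and unused come from the dump above and the literal "Unused" from the quoted reply; the literal "Used" and the function used_length are assumptions added for illustration.

```c
#include <stddef.h>
#include <string.h>

/* With -fdata-sections, each pointer variable gets its own section
 * (.data.test and .data.unused in the dump above). */
const char *test = "Used";      /* literal content is an assumption */
const char *unused = "Unused";  /* never referenced by any code */

/* Only `test` is referenced, so a section-level garbage collection pass
 * can drop the pointer `unused`. Whether the "Unused" bytes also go away
 * depends on where the string literal itself was placed. */
size_t used_length(void)
{
    return strlen(test);
}
```

Compiling this with something like `clang -c -fdata-sections example.c` and listing the sections with `readelf -S example.o` should show the two .data.* sections, while the string literals share a single mergeable string section rather than getting one each, which is the asymmetry I am asking about.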