[lld] [ELF] Orphan placement: remove hasInputSections condition (PR #93761)

Fangrui Song via llvm-commits llvm-commits at lists.llvm.org
Mon Jun 10 14:20:29 PDT 2024


MaskRay wrote:

> > I think the new behavior is expected. Use the example at #93761 (comment). Given
> 
> No other linkers put .dynsym and friends into an executable segment. Why do you find this acceptable?

```
PHDRS { text PT_LOAD; rodata PT_LOAD; }
SECTIONS {
  .text1 : { *(.text1) } : text
  .other : { LONG(0); *(.other) }
  .text2 : { *(.text2) }
  .rodata1 : { *(.rodata) } : rodata
}
```

(I use `.rodata1` instead of `.rodata` to avoid GNU ld's special handling of certain section names.)

GNU ld's rank system does not differentiate .dynsym and PROGBITS read-only sections.
Whether `.rodata1` is present determines whether .dynsym will be placed after .text2 or .rodata1.
GNU ld doesn't have a condition resembling our `hasInputSections`.

lld has a more fine-grained rank system and prefers to place .dynsym after the first read-only PROGBITS section, when PHDRS/MEMORY is specified.
The script is underspecified. When .rodata1 is present, .dynsym was previously placed after .rodata1.
While some might prefer this behavior, I don't think the script provides enough signal, and this is not sufficient justification to retain the `hasInputSections` condition.
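
To make the effect of the condition concrete, here is a toy Python model of the anchor choice for a single orphan. It is not lld's actual implementation: the `OutputSection` class, the `place_orphan` helper, and the two coarse rank classes (`"text"`, `"rodata"`) are hypothetical names made up for illustration. It only captures the idea that the orphan goes after the first output section of a similar rank, and that the old behavior skipped output sections without input sections. It does not model the finer sortRank comparison that sends `.orphan` after `.rodata1`.

```python
from dataclasses import dataclass


@dataclass
class OutputSection:
    name: str
    rank: str                 # coarse class only (assumption): "text" or "rodata"
    has_input_sections: bool  # False when the section holds only script data such as LONG(0)


def place_orphan(sections, orphan_name, orphan_rank, require_input_sections):
    """Insert the orphan after the first eligible section of the same coarse rank."""
    names = [s.name for s in sections]
    for i, sec in enumerate(sections):
        eligible = sec.has_input_sections or not require_input_sections
        if eligible and sec.rank == orphan_rank:
            return names[:i + 1] + [orphan_name] + names[i + 1:]
    return names + [orphan_name]  # no similar section: append at the end


# Output sections from the linker script above; .other contains only LONG(0).
script = [
    OutputSection(".text1", "text", True),
    OutputSection(".other", "rodata", False),
    OutputSection(".text2", "text", True),
    OutputSection(".rodata1", "rodata", True),
]

old = place_orphan(script, ".dynsym", "rodata", require_input_sections=True)
new = place_orphan(script, ".dynsym", "rodata", require_input_sections=False)
print(old)  # .dynsym lands after .rodata1
print(new)  # .dynsym lands after .other
```

With the condition, `.other` is ignored as an anchor and `.dynsym` follows `.rodata1`; without it, `.dynsym` follows `.other`, which is the difference visible in the two lld transcripts.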

```
% ld.bfd -shared test.o -T test.t -o test.so && readelf -WSl test.so
...
  [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            0000000000000000 000000 000000 00      0   0  0
  [ 1] .text1            PROGBITS        0000000000000000 001000 000001 00  AX  0   0  1
  [ 2] .other            PROGBITS        0000000000000001 001001 000004 00   A  0   0  1
  [ 3] .text2            PROGBITS        0000000000000005 001005 000001 00  AX  0   0  1
  [ 4] .rodata1          PROGBITS        0000000000000006 001006 000004 00   A  0   0  1
## GNU ld places .orphan and .dynsym after .rodata1
  [ 5] .orphan           PROGBITS        000000000000000a 00100a 000004 00   A  0   0  1
  [ 6] .dynsym           DYNSYM          0000000000000010 001010 000018 18   A  7   1  8
  [ 7] .dynstr           STRTAB          0000000000000028 001028 000001 00   A  0   0  1
  [ 8] .hash             HASH            0000000000000030 001030 000010 04   A  6   0  8
  [ 9] .gnu.hash         GNU_HASH        0000000000000040 001040 00001c 00   A  6   0  8
  [10] .dynamic          DYNAMIC         0000000000000060 001060 0000c0 10  WA  7   0  8
...
% oldld.lld -shared test.o -T test.t -o test.so && readelf -WSl test.so
...
  [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            0000000000000000 000000 000000 00      0   0  0
  [ 1] .text1            PROGBITS        0000000000000000 001000 000001 00  AX  0   0  1
  [ 2] .other            PROGBITS        0000000000000001 001001 000004 00   A  0   0  1
## .orphan and .dynsym are not placed here because .other has no input section. If .other had a read-only input section, .orphan and .dynsym would be placed here.
  [ 3] .text2            PROGBITS        0000000000000005 001005 000001 00  AX  0   0  1
  [ 4] .text             PROGBITS        0000000000000008 001008 000000 00  AX  0   0  4
  [ 5] .rodata1          PROGBITS        0000000000000008 001008 000004 00   A  0   0  1
  [ 6] .orphan           PROGBITS        000000000000000c 00100c 000004 00   A  0   0  1
  [ 7] .dynsym           DYNSYM          0000000000000010 001010 000018 18   A 10   1  8
  [ 8] .gnu.hash         GNU_HASH        0000000000000028 001028 00001c 00   A  7   0  8
  [ 9] .hash             HASH            0000000000000044 001044 000010 04   A  7   0  4
  [10] .dynstr           STRTAB          0000000000000054 001054 000001 00   A  0   0  1
  [11] .dynamic          DYNAMIC         0000000000000058 001058 000070 10  WA 10   0  8
...
% newld.lld -shared test.o -T test.t -o test.so && readelf -WSl test.so
...
  [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            0000000000000000 000000 000000 00      0   0  0
  [ 1] .text1            PROGBITS        0000000000000000 001000 000001 00  AX  0   0  1
  [ 2] .other            PROGBITS        0000000000000001 001001 000004 00   A  0   0  1
## .dynsym (whose sortRank is smaller than .other's) is placed here, regardless of whether or not .other has a read-only input section.
  [ 3] .dynsym           DYNSYM          0000000000000008 001008 000018 18   A  6   1  8
  [ 4] .gnu.hash         GNU_HASH        0000000000000020 001020 00001c 00   A  3   0  8
  [ 5] .hash             HASH            000000000000003c 00103c 000010 04   A  3   0  4
  [ 6] .dynstr           STRTAB          000000000000004c 00104c 000001 00   A  0   0  1
  [ 7] .text2            PROGBITS        000000000000004d 00104d 000001 00  AX  0   0  1
  [ 8] .text             PROGBITS        0000000000000050 001050 000000 00  AX  0   0  4
  [ 9] .rodata1          PROGBITS        0000000000000050 001050 000004 00   A  0   0  1
  [10] .orphan           PROGBITS        0000000000000054 001054 000004 00   A  0   0  1
  [11] .dynamic          DYNAMIC         0000000000000058 001058 000070 10  WA  6   0  8
...
```
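
For comparing the three transcripts mechanically, a small script along these lines could report which section precedes `.dynsym`. This is only a sketch: the helper names `section_names` and `section_before` are hypothetical, and the regular expression assumes the `readelf -WS` row layout shown above.

```python
import re

# Matches a `readelf -S` row such as "  [ 3] .dynsym   DYNSYM ..." and
# captures the index and the first following token.
ROW = re.compile(r"^\s*\[\s*(\d+)\]\s+(\S+)")


def section_names(readelf_output):
    """Return section names in table order, skipping the unnamed NULL entry at index 0."""
    names = []
    for line in readelf_output.splitlines():
        m = ROW.match(line)
        if m and int(m.group(1)) != 0:
            names.append(m.group(2))
    return names


def section_before(readelf_output, target):
    """Return the name of the section listed immediately before `target`, or None."""
    names = section_names(readelf_output)
    i = names.index(target)
    return names[i - 1] if i > 0 else None


# First rows of the newld.lld transcript above.
new_lld = """\
  [ 0]                   NULL            0000000000000000 000000 000000 00      0   0  0
  [ 1] .text1            PROGBITS        0000000000000000 001000 000001 00  AX  0   0  1
  [ 2] .other            PROGBITS        0000000000000001 001001 000004 00   A  0   0  1
  [ 3] .dynsym           DYNSYM          0000000000000008 001008 000018 18   A  6   1  8
"""
print(section_before(new_lld, ".dynsym"))  # .other
```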

> Moreover, if we change the linker script, following the idea of lld to put read-only sections before executable ones:
> 
> [...]
> As you can see, the problem here is exactly the same.

What's the problem? `.other` has a rank of read-only sections. `.orphan` has the same rank and is placed after `.other` with the simplified condition.


https://github.com/llvm/llvm-project/pull/93761

