[llvm] [llvm-objcopy] Fix file offsets when PT_INTERP/PT_LOAD offsets are equal (PR #80562)

via llvm-commits llvm-commits at lists.llvm.org
Sat Feb 3 12:16:09 PST 2024


llvmbot wrote:


<!--LLVM PR SUMMARY COMMENT-->

@llvm/pr-subscribers-llvm-binary-utilities

Author: Fangrui Song (MaskRay)

<details>
<summary>Changes</summary>

(#<!-- -->79887) When the offset of a PT_INTERP segment equals the offset of a
PT_LOAD segment, we consider that the parent of the PT_LOAD segment is
the PT_INTERP segment. In `layoutSegments`, we place both segments to be
after the current `Offset`, ignoring the PT_LOAD alignment.

This scenario is possible with fixed section addresses, but doesn't
happen with default linker layouts (.interp precedes other sections and
is part of a PT_LOAD segment containing the ELF header and program
headers).

```
% cat a.s
.globl _start; _start: ret
.rodata; .byte 0
.tdata; .balign 4096; .byte 0
% clang -fuse-ld=lld a.s -o a -nostdlib -no-pie -z separate-loadable-segments -Wl,-Ttext=0x201000,--section-start=.interp=0x202000,--section-start=.rodata=0x202020,-z,nognustack
% llvm-objcopy a a2
% llvm-readelf -l a2   # incorrect offset(PT_LOAD)
  Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
  PHDR           0x000040 0x0000000000200040 0x0000000000200040 0x0001c0 0x0001c0 R   0x8
  INTERP         0x001001 0x0000000000202000 0x0000000000202000 0x00001c 0x00001c R   0x1
      [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
  LOAD           0x000000 0x0000000000200000 0x0000000000200000 0x000200 0x000200 R   0x1000
  LOAD           0x001000 0x0000000000201000 0x0000000000201000 0x000001 0x000001 R E 0x1000
//// incorrect offset
  LOAD           0x001001 0x0000000000202000 0x0000000000202000 0x000021 0x000021 R   0x1000
  LOAD           0x002000 0x0000000000203000 0x0000000000203000 0x000001 0x001000 RW  0x1000
  TLS            0x002000 0x0000000000203000 0x0000000000203000 0x000001 0x000001 R   0x1000
  GNU_RELRO      0x002000 0x0000000000203000 0x0000000000203000 0x000001 0x001000 R   0x1000
```

The same issue occurs for PT_TLS/PT_GNU_RELRO if we PT_TLS's alignment
is smaller and we place the PT_LOAD after PT_TLS/PT_GNU_RELRO segments
(not linker default, but possible with a `PHDRS` linker script command).

Fix #<!-- -->79887: set the parent of the PT_INTERP segment to the PT_LOAD
segment when their addresses are equal.


---
Full diff: https://github.com/llvm/llvm-project/pull/80562.diff


2 Files Affected:

- (modified) llvm/lib/ObjCopy/ELF/ELFObject.cpp (+7) 
- (added) llvm/test/tools/llvm-objcopy/ELF/non-load-at-load-start.test (+138) 


``````````diff
diff --git a/llvm/lib/ObjCopy/ELF/ELFObject.cpp b/llvm/lib/ObjCopy/ELF/ELFObject.cpp
index c8b66d6fcb5eb..edf0b53faf96d 100644
--- a/llvm/lib/ObjCopy/ELF/ELFObject.cpp
+++ b/llvm/lib/ObjCopy/ELF/ELFObject.cpp
@@ -1234,6 +1234,13 @@ static bool compareSegmentsByOffset(const Segment *A, const Segment *B) {
     return true;
   if (A->OriginalOffset > B->OriginalOffset)
     return false;
+  // If one is PT_LOAD and the other isn't (e.g. PT_INTERP, PT_GNU_RELRO,
+  // PT_TLS), order the PT_LOAD first to ensure ParentSegment relationship will
+  // be correct.
+  bool AIsLOAD = A->Type == PT_LOAD;
+  bool BIsLOAD = B->Type == PT_LOAD;
+  if (AIsLOAD != BIsLOAD)
+    return AIsLOAD;
   return A->Index < B->Index;
 }
 
diff --git a/llvm/test/tools/llvm-objcopy/ELF/non-load-at-load-start.test b/llvm/test/tools/llvm-objcopy/ELF/non-load-at-load-start.test
new file mode 100644
index 0000000000000..7551b6473379f
--- /dev/null
+++ b/llvm/test/tools/llvm-objcopy/ELF/non-load-at-load-start.test
@@ -0,0 +1,138 @@
+## When the offset of a non-PT_LOAD segment (e.g. PT_INTERP) equals the offset
+## of a PT_LOAD segment, set the parent of the PT_INTERP segment to the PT_LOAD
+## segment, ensuring that the offset is correctly aligned.
+
+# RUN: yaml2obj %s -o %t
+# RUN: llvm-objcopy %t %t2
+# RUN: llvm-readelf -Sl %t2 | FileCheck %s
+
+# CHECK:       [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
+# CHECK-NEXT:  [ 0]                   NULL            0000000000000000 000000 000000 00      0   0  0
+# CHECK-NEXT:  [ 1] .text             PROGBITS        0000000000201000 001000 000001 00  AX  0   0  4
+# CHECK-NEXT:  [ 2] .interp           PROGBITS        0000000000202000 002000 00001c 00   A  0   0  1
+# CHECK-NEXT:  [ 3] .rodata           PROGBITS        0000000000202020 002020 000001 00   A  0   0  1
+# CHECK-NEXT:  [ 4] .tdata            PROGBITS        0000000000203000 003000 000001 00 WAT  0   0 4096
+# CHECK-NEXT:  [ 5] .relro_padding    NOBITS          0000000000203001 003001 000fff 00  WA  0   0  1
+# CHECK-NEXT:  [ 6] .symtab           SYMTAB          0000000000000000 003008 000030 18      7   1  8
+# CHECK-NEXT:  [ 7] .strtab           STRTAB          0000000000000000 003038 000008 00      0   0  1
+# CHECK-NEXT:  [ 8] .shstrtab         STRTAB          0000000000000000 003040 000047 00      0   0  1
+
+# CHECK:     Program Headers:
+# CHECK-NEXT:  Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
+# CHECK-NEXT:  PHDR           0x000040 0x0000000000200040 0x0000000000200040 0x0001c0 0x0001c0 R   0x8
+# CHECK-NEXT:  INTERP         0x002000 0x0000000000202000 0x0000000000202000 0x00001c 0x00001c R   0x1
+# CHECK-NEXT:      [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
+# CHECK-NEXT:  LOAD           0x000000 0x0000000000200000 0x0000000000200000 0x000200 0x000200 R   0x1000
+# CHECK-NEXT:  LOAD           0x001000 0x0000000000201000 0x0000000000201000 0x000001 0x000001 R E 0x1000
+# CHECK-NEXT:  LOAD           0x002000 0x0000000000202000 0x0000000000202000 0x000021 0x000021 R   0x1000
+# CHECK-NEXT:  TLS            0x003000 0x0000000000203000 0x0000000000203000 0x000001 0x001000 R   0x1000
+# CHECK-NEXT:  GNU_RELRO      0x003000 0x0000000000203000 0x0000000000203000 0x000001 0x001000 R   0x1000
+# CHECK-NEXT:  LOAD           0x003000 0x0000000000203000 0x0000000000203000 0x000001 0x001000 RW  0x1000
+
+--- !ELF
+FileHeader:
+  Class:           ELFCLASS64
+  Data:            ELFDATA2LSB
+  Type:            ET_EXEC
+  Machine:         EM_X86_64
+  Entry:           0x201000
+ProgramHeaders:
+  - Type:            PT_PHDR
+    Flags:           [ PF_R ]
+    VAddr:           0x200040
+    Align:           0x8
+    Offset:          0x40
+    FileSize:        0x1c0
+    MemSize:         0x1c0
+  - Type:            PT_INTERP
+    Flags:           [ PF_R ]
+    FirstSec:        .interp
+    LastSec:         .interp
+    ## The address equals the address of its containing PT_LOAD.
+    VAddr:           0x202000
+    Offset:          0x2000
+  - Type:            PT_LOAD
+    Flags:           [ PF_R ]
+    VAddr:           0x200000
+    Align:           0x1000
+    Offset:          0x0
+    FileSize:        0x200
+    MemSize:         0x200
+  - Type:            PT_LOAD
+    Flags:           [ PF_X, PF_R ]
+    FirstSec:        .text
+    LastSec:         .text
+    VAddr:           0x201000
+    Align:           0x1000
+    Offset:          0x1000
+  - Type:            PT_LOAD
+    Flags:           [ PF_R ]
+    FirstSec:        .interp
+    LastSec:         .rodata
+    VAddr:           0x202000
+    Align:           0x1000
+    Offset:          0x2000
+  ## Intentionally place PT_TLS/PT_GNU_RELRO before PT_LOAD to test that we
+  ## correctly set parent segments.
+  - Type:            PT_TLS
+    Flags:           [ PF_R ]
+    FirstSec:        .tdata
+    LastSec:         .relro_padding
+    VAddr:           0x203000
+    Align:           0x1000
+    Offset:          0x3000
+  - Type:            PT_GNU_RELRO
+    Flags:           [ PF_R ]
+    FirstSec:        .tdata
+    LastSec:         .relro_padding
+    VAddr:           0x203000
+    Align:           0x1000
+    Offset:          0x3000
+  - Type:            PT_LOAD
+    Flags:           [ PF_W, PF_R ]
+    FirstSec:        .tdata
+    LastSec:         .relro_padding
+    VAddr:           0x203000
+    Align:           0x1000
+    Offset:          0x3000
+Sections:
+  - Name:            .text
+    Type:            SHT_PROGBITS
+    Flags:           [ SHF_ALLOC, SHF_EXECINSTR ]
+    Address:         0x201000
+    AddressAlign:    0x4
+    Offset:          0x1000
+    Content:         C3
+  - Name:            .interp
+    Type:            SHT_PROGBITS
+    Flags:           [ SHF_ALLOC ]
+    Address:         0x202000
+    AddressAlign:    0x1
+    Offset:          0x2000
+    Content:         2F6C696236342F6C642D6C696E75782D7838362D36342E736F2E3200
+  - Name:            .rodata
+    Type:            SHT_PROGBITS
+    Flags:           [ SHF_ALLOC ]
+    Address:         0x202020
+    AddressAlign:    0x1
+    Offset:          0x2020
+    Content:         '00'
+  - Name:            .tdata
+    Type:            SHT_PROGBITS
+    Flags:           [ SHF_WRITE, SHF_ALLOC, SHF_TLS ]
+    Address:         0x203000
+    AddressAlign:    0x1000
+    Offset:          0x3000
+    Content:         '00'
+  - Name:            .relro_padding
+    Type:            SHT_NOBITS
+    Flags:           [ SHF_WRITE, SHF_ALLOC ]
+    Address:         0x203001
+    AddressAlign:    0x1
+    Size:            0xFFF
+Symbols:
+  - Name:            _start
+    Section:         .text
+    Binding:         STB_GLOBAL
+    Value:           0x201000
+...

``````````

</details>


https://github.com/llvm/llvm-project/pull/80562


More information about the llvm-commits mailing list