[llvm] d28c6d5 - [llvm-objcopy][ELF] -O binary: use LMA instead of sh_offset to decide where to write section contents

Fangrui Song via llvm-commits llvm-commits at lists.llvm.org
Sun Dec 15 21:47:48 PST 2019


Author: Fangrui Song
Date: 2019-12-15T21:45:25-08:00
New Revision: d28c6d51d1547d9cd7cd5b7e36b4c03f38ef7c67

URL: https://github.com/llvm/llvm-project/commit/d28c6d51d1547d9cd7cd5b7e36b4c03f38ef7c67
DIFF: https://github.com/llvm/llvm-project/commit/d28c6d51d1547d9cd7cd5b7e36b4c03f38ef7c67.diff

LOG: [llvm-objcopy][ELF] -O binary: use LMA instead of sh_offset to decide where to write section contents

.text sh_addr=0x1000 sh_offset=0x1000
.data sh_addr=0x3000 sh_offset=0x2000

In an objcopy -O binary output, the distance between two sections equals
the difference between their LMAs (0x3000-0x1000), not the difference
between their sh_offset values (0x2000-0x1000). This patch changes our
behavior to match GNU objcopy.

This rule gets more complex when the containing PT_LOAD has
p_vaddr!=p_paddr. GNU objcopy essentially computes
sh_offset-p_offset+p_paddr for each candidate section, and removes the
gap before the first address.
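
The rule above can be sketched in isolation as follows. This is only an
illustration, not the code in Object.cpp: the Segment and Section structs
and the helpers computeLMA, outputOffsets and outputSize are hypothetical,
simplified stand-ins, and SHT_NOBITS handling is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical, simplified stand-ins for ELF structures (assumption: these
// are not llvm-objcopy's own types).
struct Segment {
  uint64_t p_offset;
  uint64_t p_paddr;
};

struct Section {
  uint64_t sh_addr;
  uint64_t sh_offset;
  uint64_t sh_size;
  std::optional<Segment> parent; // containing PT_LOAD, if any
};

// LMA = sh_offset - p_offset + p_paddr for a section inside a PT_LOAD;
// otherwise the LMA is taken to be sh_addr.
uint64_t computeLMA(const Section &s) {
  if (s.parent)
    return s.sh_offset - s.parent->p_offset + s.parent->p_paddr;
  return s.sh_addr;
}

// Each section is written at (its LMA - minimum LMA), which drops the gap
// before the first address.
std::vector<uint64_t> outputOffsets(const std::vector<Section> &secs) {
  assert(!secs.empty() && "expects at least one section");
  uint64_t minAddr = UINT64_MAX;
  for (const Section &s : secs)
    minAddr = std::min(minAddr, computeLMA(s));
  std::vector<uint64_t> offsets;
  for (const Section &s : secs)
    offsets.push_back(computeLMA(s) - minAddr);
  return offsets;
}

// Total output size: the furthest end of any section (all sections are
// assumed to occupy file space in this sketch).
uint64_t outputSize(const std::vector<Section> &secs) {
  std::vector<uint64_t> offs = outputOffsets(secs);
  uint64_t total = 0;
  for (std::size_t i = 0; i < secs.size(); ++i)
    total = std::max(total, offs[i] + secs[i].sh_size);
  return total;
}
```

Fed the layout of the second test document (.text outside any PT_LOAD at
sh_addr 0x1000; .data with sh_offset 0x2000 in a PT_LOAD with p_offset
0x2000 and p_paddr 0x4000), this sketch places .data at 0x3000 and gives a
total size of 12290 bytes, matching the CHECK2/SIZE2 expectations.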

Added tests to binary-paddr.test to catch the compatibility problem.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D71035

Added: 
    

Modified: 
    llvm/test/tools/llvm-objcopy/ELF/binary-paddr.test
    llvm/tools/llvm-objcopy/ELF/Object.cpp

Removed: 
    


################################################################################
diff --git a/llvm/test/tools/llvm-objcopy/ELF/binary-paddr.test b/llvm/test/tools/llvm-objcopy/ELF/binary-paddr.test
index 58939e876eab..5ac692dc5f86 100644
--- a/llvm/test/tools/llvm-objcopy/ELF/binary-paddr.test
+++ b/llvm/test/tools/llvm-objcopy/ELF/binary-paddr.test
@@ -1,9 +1,21 @@
-# RUN: yaml2obj %s -o %t
-# RUN: llvm-objcopy -O binary %t %t2
-# RUN: od -t x2 %t2 | FileCheck %s --ignore-case
-# RUN: wc -c < %t2 | FileCheck %s --check-prefix=SIZE
+## The computed LMA of a section in a PT_LOAD equals sh_offset-p_offset+p_paddr.
+## The byte offset difference between two sections equals the difference between their LMAs.
 
-!ELF
+## Corollary: if two sections are in the same PT_LOAD, the byte offset
+## difference equals the difference between their sh_addr fields.
+
+# RUN: yaml2obj --docnum=1 %s -o %t1
+# RUN: llvm-objcopy -O binary %t1 %t1.out
+# RUN: od -A x -t x2 %t1.out | FileCheck %s --check-prefix=CHECK1 --ignore-case
+# RUN: wc -c %t1.out | FileCheck %s --check-prefix=SIZE1
+
+# CHECK1:      000000 c3c3 c3c3 0000 0000 0000 0000 0000 0000
+# CHECK1-NEXT: 000010 0000 0000 0000 0000 0000 0000 0000 0000
+# CHECK1-NEXT: *
+# CHECK1-NEXT: 001000 3232
+# SIZE1:       4098
+
+--- !ELF
 FileHeader:
   Class:           ELFCLASS64
   Data:            ELFDATA2LSB
@@ -14,32 +26,107 @@ Sections:
     Type:            SHT_PROGBITS
     Flags:           [ SHF_ALLOC, SHF_EXECINSTR ]
     Address:         0x1000
-    AddressAlign:    0x0000000000001000
+    AddressAlign:    0x1000
     Content:         "c3c3c3c3"
   - Name:            .data
     Type:            SHT_PROGBITS
-    Flags:           [ SHF_ALLOC ]
+    Flags:           [ SHF_ALLOC, SHF_WRITE ]
     Address:         0x2000
-    AddressAlign:    0x0000000000001000
+    AddressAlign:    0x1000
     Content:         "3232"
 ProgramHeaders:
   - Type: PT_LOAD
-    Flags: [ PF_X, PF_R ]
-    VAddr: 0x1000
-    PAddr: 0x1000
-    Align: 0x1000
+    Flags: [ PF_R, PF_W ]
     Sections:
       - Section: .text
+      - Section: .data
+
+## The computed LMA of a section not in a PT_LOAD equals its sh_addr.
+
+# RUN: yaml2obj --docnum=2 %s -o %t2
+# RUN: llvm-objcopy -O binary %t2 %t2.out
+# RUN: od -A x -t x2 %t2.out | FileCheck %s --check-prefix=CHECK2 --ignore-case
+# RUN: wc -c %t2.out | FileCheck %s --check-prefix=SIZE2
+
+## The computed LMA of .data is 0x4000. The minimum LMA of all sections is 0x1000.
+## The content of .data will be written at 0x4000-0x1000 = 0x3000.
+# CHECK2:      000000 c3c3 c3c3 0000 0000 0000 0000 0000 0000
+# CHECK2-NEXT: 000010 0000 0000 0000 0000 0000 0000 0000 0000
+# CHECK2-NEXT: *
+# CHECK2-NEXT: 003000 3232
+# SIZE2:       12290
+
+--- !ELF
+FileHeader:
+  Class:           ELFCLASS64
+  Data:            ELFDATA2LSB
+  Type:            ET_EXEC
+  Machine:         EM_X86_64
+Sections:
+  - Name:            .text
+    Type:            SHT_PROGBITS
+    Flags:           [ SHF_ALLOC, SHF_EXECINSTR ]
+    ## Not in a PT_LOAD. LMA = sh_addr = 0x1000.
+    Address:         0x1000
+    AddressAlign:    0x1000
+    Content:         "c3c3c3c3"
+  - Name:            .data
+    Type:            SHT_PROGBITS
+    Flags:           [ SHF_ALLOC, SHF_WRITE ]
+    ## LMA = sh_offset-p_offset+p_paddr = 0x2000-0x2000+0x4000 = 0x4000.
+    Address:         0x2000
+    AddressAlign:    0x1000
+    Content:         "3232"
+ProgramHeaders:
   - Type: PT_LOAD
     Flags: [ PF_R, PF_W ]
     VAddr: 0x2000
+    ## p_paddr is increased from 0x2000 to 0x4000.
     PAddr: 0x4000
-    Align: 0x1000
     Sections:
       - Section: .data
 
-# CHECK:       0000000 c3c3 c3c3 0000 0000 0000 0000 0000 0000
-# CHECK-NEXT:  0000020 0000 0000 0000 0000 0000 0000 0000 0000
-# CHECK-NEXT:  *
-# CHECK-NEXT:  0030000 3232
-# SIZE:        12290
+## Check that we use sh_offset instead of sh_addr to decide where to write section contents.
+
+# RUN: yaml2obj --docnum=3 %s -o %t3
+# RUN: llvm-objcopy -O binary %t3 %t3.out
+# RUN: od -A x -t x2 %t3.out | FileCheck %s --check-prefix=CHECK3 --ignore-case
+# RUN: wc -c %t3.out | FileCheck %s --check-prefix=SIZE3
+
+## The minimum LMA of all sections is 0x1000.
+## The content of .data will be written at 0x3000-0x1000 = 0x2000.
+# CHECK3:      000000 c3c3 c3c3 0000 0000 0000 0000 0000 0000
+# CHECK3-NEXT: 000010 0000 0000 0000 0000 0000 0000 0000 0000
+# CHECK3-NEXT: *
+# CHECK3-NEXT: 002000 3232
+# SIZE3:       8194
+
+--- !ELF
+FileHeader:
+  Class:           ELFCLASS64
+  Data:            ELFDATA2LSB
+  Type:            ET_EXEC
+  Machine:         EM_X86_64
+Sections:
+  - Name:            .text
+    Type:            SHT_PROGBITS
+    Flags:           [ SHF_ALLOC, SHF_EXECINSTR ]
+    ## Not in a PT_LOAD. LMA = sh_addr = 0x1000.
+    Address:         0x1000
+    AddressAlign:    0x1000
+    Content:         "c3c3c3c3"
+  - Name:            .data
+    Type:            SHT_PROGBITS
+    Flags:           [ SHF_ALLOC, SHF_WRITE ]
+    ## sh_addr is increased from 0x2000 to 0x3000, but it is ignored.
+    ## LMA = sh_offset-p_offset+p_paddr = 0x2000-0x2000+0x3000 = 0x3000.
+    Address:         0x3000
+    AddressAlign:    0x1000
+    Content:         "3232"
+ProgramHeaders:
+  - Type: PT_LOAD
+    Flags: [ PF_R, PF_W ]
+    VAddr: 0x3000
+    PAddr: 0x3000
+    Sections:
+      - Section: .data

diff --git a/llvm/tools/llvm-objcopy/ELF/Object.cpp b/llvm/tools/llvm-objcopy/ELF/Object.cpp
index a1073fe9dcb8..ad53c75663ec 100644
--- a/llvm/tools/llvm-objcopy/ELF/Object.cpp
+++ b/llvm/tools/llvm-objcopy/ELF/Object.cpp
@@ -2253,38 +2253,28 @@ Error BinaryWriter::finalize() {
       std::unique(std::begin(OrderedSegments), std::end(OrderedSegments));
   OrderedSegments.erase(End, std::end(OrderedSegments));
 
-  uint64_t Offset = 0;
-
-  // Modify the first segment so that there is no gap at the start. This allows
-  // our layout algorithm to proceed as expected while not writing out the gap
-  // at the start.
-  if (!OrderedSegments.empty()) {
-    Segment *Seg = OrderedSegments[0];
-    const SectionBase *Sec = Seg->firstSection();
-    auto Diff = Sec->OriginalOffset - Seg->OriginalOffset;
-    Seg->OriginalOffset += Diff;
-    // The size needs to be shrunk as well.
-    Seg->FileSize -= Diff;
-    // The PAddr needs to be increased to remove the gap before the first
-    // section.
-    Seg->PAddr += Diff;
-    uint64_t LowestPAddr = Seg->PAddr;
-    for (Segment *Segment : OrderedSegments) {
-      Segment->Offset = Segment->PAddr - LowestPAddr;
-      Offset = std::max(Offset, Segment->Offset + Segment->FileSize);
-    }
+  // Compute the section LMA based on its sh_offset and the containing segment's
+  // p_offset and p_paddr. Also compute the minimum LMA of all sections as
+  // MinAddr. In the output, the contents between address 0 and MinAddr will be
+  // skipped.
+  uint64_t MinAddr = UINT64_MAX;
+  for (SectionBase &Sec : Obj.allocSections()) {
+    if (Sec.ParentSegment != nullptr)
+      Sec.Addr =
+          Sec.Offset - Sec.ParentSegment->Offset + Sec.ParentSegment->PAddr;
+    MinAddr = std::min(MinAddr, Sec.Addr);
   }
 
-  layoutSections(Obj.allocSections(), Offset);
-
   // Now that every section has been laid out we just need to compute the total
   // file size. This might not be the same as the offset returned by
   // layoutSections, because we want to truncate the last segment to the end of
   // its last section, to match GNU objcopy's behaviour.
   TotalSize = 0;
-  for (const SectionBase &Sec : Obj.allocSections())
+  for (SectionBase &Sec : Obj.allocSections()) {
+    Sec.Offset = Sec.Addr - MinAddr;
     if (Sec.Type != SHT_NOBITS)
       TotalSize = std::max(TotalSize, Sec.Offset + Sec.Size);
+  }
 
   if (Error E = Buf.allocate(TotalSize))
     return E;


        

