[PATCH] D58426: llvm-objcopy: Change sectionWithinSegment() to use virtual addresses instead of file offsets.

Jake Ehrlich via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Mar 19 12:51:53 PDT 2019


jakehehrlich added a comment.

I contest that this change is correct. This predicate is specifically designed to capture which bytes overlap in the file, not the run time semantic goal you're looking for. I'm also concerned that it will interact badly with PT_TLS, . It was for this exact reason (handling of PT_TLS) that we chose to use Offset and Filesize *instead* of VAddr and MemSize as was my first instinct.

Consider the following not-quote-counter example that I pulled from the first binary I found in fuchsia that uses TLS. You can make it into a counter example by simply adding a giant symbol to .tbss

  [19] .tdata            PROGBITS         00000000001ae000  001ae000
       0000000000000148  0000000000000000 WAT       0     0     8
  [20] .tbss             NOBITS           00000000001ae148  001ae148
       00000000000008f8  0000000000000000 WAT       0     0     8
  [21] .data.rel.ro      PROGBITS         00000000001ae148  001ae148
       0000000000006ff0  0000000000000000  WA       0     0     8
  [22] .got              PROGBITS         00000000001b5138  001b5138
       0000000000000258  0000000000000000  WA       0     0     8
  [23] .bss              NOBITS           00000000001b6000  001b5390
       0000000000000210  0000000000000000  WA       0     0     8
  
  LOAD           0x00000000001ad000 0x00000000001ad000 0x00000000001ad000
                 0x0000000000008390 0x0000000000009210  RW     0x1000
  TLS            0x00000000001ae000 0x00000000001ae000 0x00000000001ae000
                 0x0000000000000148 0x0000000000000a40  R      0x8

By your predicate .data.rel.ro is *not* in the PT_TLS segment but it if .tbss had been larger your predicate *would* say that .data.rel.ro is in PT_TLS. That is incorrect behavior and this is the reason that we very deliberately used Offset and FileSize and not VAddr and MemSize.

Bugs of this form are mitigated by trying to find the "most parental" segment but the heuristic used there is to favor the segment with the smallest offset. If a PT_TLS segment was in the program header table first, had the same offset as the encompassing PT_LOAD (e.g. .data and .got.plt was empty), and had there was a large symbol defined in .tbss you're predicate would cause layout to fail.

The offset of NOBITS sections shouldn't matter but we ofcourse do (via ParentSegment) use this information in stripping now. This accounts for cases where we want to avoid stripping a section that isn't allocated but is in a segment which can for instance happen with some PT_NOTE segments in some questionable cases where the PT_NOTE segment is added after static linking occurs. There might be a bug lurking in situations where NOBITS have unconventional (but still correct) offsets and we're now not stripping them correctly but that's a recent occurrence and a different bug.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D58426/new/

https://reviews.llvm.org/D58426





More information about the llvm-commits mailing list