[PATCH] D58426: llvm-objcopy: Change sectionWithinSegment() to use virtual addresses instead of file offsets.
Jake Ehrlich via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Mar 19 12:51:53 PDT 2019
jakehehrlich added a comment.
I contest that this change is correct. This predicate is specifically designed to capture which bytes overlap in the file, not the run time semantic goal you're looking for. I'm also concerned that it will interact badly with PT_TLS, . It was for this exact reason (handling of PT_TLS) that we chose to use Offset and Filesize *instead* of VAddr and MemSize as was my first instinct.
Consider the following not-quote-counter example that I pulled from the first binary I found in fuchsia that uses TLS. You can make it into a counter example by simply adding a giant symbol to .tbss
[19] .tdata PROGBITS 00000000001ae000 001ae000
0000000000000148 0000000000000000 WAT 0 0 8
[20] .tbss NOBITS 00000000001ae148 001ae148
00000000000008f8 0000000000000000 WAT 0 0 8
[21] .data.rel.ro PROGBITS 00000000001ae148 001ae148
0000000000006ff0 0000000000000000 WA 0 0 8
[22] .got PROGBITS 00000000001b5138 001b5138
0000000000000258 0000000000000000 WA 0 0 8
[23] .bss NOBITS 00000000001b6000 001b5390
0000000000000210 0000000000000000 WA 0 0 8
LOAD 0x00000000001ad000 0x00000000001ad000 0x00000000001ad000
0x0000000000008390 0x0000000000009210 RW 0x1000
TLS 0x00000000001ae000 0x00000000001ae000 0x00000000001ae000
0x0000000000000148 0x0000000000000a40 R 0x8
By your predicate .data.rel.ro is *not* in the PT_TLS segment but it if .tbss had been larger your predicate *would* say that .data.rel.ro is in PT_TLS. That is incorrect behavior and this is the reason that we very deliberately used Offset and FileSize and not VAddr and MemSize.
Bugs of this form are mitigated by trying to find the "most parental" segment but the heuristic used there is to favor the segment with the smallest offset. If a PT_TLS segment was in the program header table first, had the same offset as the encompassing PT_LOAD (e.g. .data and .got.plt was empty), and had there was a large symbol defined in .tbss you're predicate would cause layout to fail.
The offset of NOBITS sections shouldn't matter but we ofcourse do (via ParentSegment) use this information in stripping now. This accounts for cases where we want to avoid stripping a section that isn't allocated but is in a segment which can for instance happen with some PT_NOTE segments in some questionable cases where the PT_NOTE segment is added after static linking occurs. There might be a bug lurking in situations where NOBITS have unconventional (but still correct) offsets and we're now not stripping them correctly but that's a recent occurrence and a different bug.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D58426/new/
https://reviews.llvm.org/D58426
More information about the llvm-commits
mailing list