[PATCH] D81784: [ELF] Resolve relocations in .debug_* referencing (discarded symbols or ICF folded section symbols) to tombstone values

David Blaikie via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Jun 18 17:31:44 PDT 2020


dblaikie accepted this revision.
dblaikie added a comment.

Some optional code layout and comments - but all good either way/whatever you find most readable/helpful.



================
Comment at: lld/ELF/InputSection.cpp:908
 
-    if (sym.isTls() && !Out::tlsPhdr)
+    auto *ds = dyn_cast<Defined>(&sym);
+    if (sym.isTls() && !Out::tlsPhdr) {
----------------
Probably not a big deal, but curious to do this test when it might not be needed. Ah, partly because fo the if/else if/else structure - if this used early-return instead, perhaps it would be easier to write it like this:

```
if (sym.isTls() && !Out::tlsPhdr) {
  target->relocateNoSym(bufLoc, type, 0);
  continue;
}
if (isDebug && type == target->symbolicRel) {
  auto *ds = dyn_cast<Defined>(&sym);
  if (!sym.getOutputSection() ||
        (ds && ds->section->repl != ds->section)) {
    target->relocateNoSym(bufLoc, type, isDebugLocOrRanges ? UINT64_MAX - 1 : UINT64_MAX);
    continue;
  }
}
target->relocateNoSym(bufLoc, type,
                            SignExtend64<bits>(sym.getVA(addend)));


================
Comment at: lld/ELF/InputSection.cpp:915-919
+      // folded section symbols) to a tombstone value. The normal behavior
+      // (resolving to addend) is unsatisfactory because the result address
+      // range may collide with a valid range of low address, or leave multiple
+      // CUs claiming ownership of the same range of code, which may confuse
+      // consumers.
----------------
"normal behavior" might be overstating it - it was the behavior of lld and is the behavior of gold, but isn't the behavior of bfd.ld.

maybe "Other options such as zero (1 for debug_ranges) (bfd.ld) or zero+addend (gold) result in overlaps with valid low-address ranges, or deeper problems terminating debug_loc and debug_range lists early"?




================
Comment at: lld/ELF/InputSection.cpp:922-924
+      // .debug_* sections. We have to ignore the addend because we don't want
+      // to resolve DW_AT_low_pc (which may have a non-zero addend) to -1+addend
+      // (wrap around to a low address).
----------------
I think you can omit the "DW_AT_low_pc (which may have a non-zero addend)" part of this, perhaps? In general we don't want -1+addend to wrap around into a valid address range, no matter which attribute it might be on.


================
Comment at: lld/ELF/InputSection.cpp:931-932
+      //
+      // For pre-DWARF-v5 .debug_loc and .debug_ranges, a pair of 0 (empty
+      // function) can terminate the entries, -1 is a reserved value (base
+      // address selection entry). Use -2 instead.
----------------
Might clarify "a pair of 0 (0+addend could cause this when handling an empty function)" (though perhaps that deosn't need to be called out here - since we already motivated this earlier in the comment talking about low or zero addresses - so only the -1 case needs to be justified/explained why it can't use the same solution as all the other .debug_ sections being handled here)

Probably "the entries and -1" rather than a comma there - since there's only two things in this list. (& maybe the end of this "Use -2 instead" might not stand on its own as a sentence - "... so -2 is used."


================
Comment at: lld/test/ELF/debug-dead-reloc.s:37
+
+## .text.3 from The second %t1.o is discarded. Resolved to UINT64_C(-1).
+.section .debug_addr
----------------
So even if the comdat group is identical, this will resolve only one of them & use -1 for the other?

I think that's a regression/separate change that probably should have a sepraate discussion compared to the -1 rather than 0+addend change.

Seems there could be some value in resolving all the function descriptions to the same code where possible - but I also see the problem with having CUs ranges overlap (which I think @probinson is especially concerned about). So the change may well be merited, but I think deserves a separate review/discussion.

(eg: currently... oh, bfd and gold currently do the "point all the DWARF to the one definition" but it seems lld doesn't do that and, as you say, already does zero (well, zero+addend) out the duplicate references to the same deduplicated comdat) Fair enough then... curious, but not a change in behavior, so guess we slipped that in somewhere along the way & hasn't broken anything that I've heard.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D81784/new/

https://reviews.llvm.org/D81784





More information about the llvm-commits mailing list