[PATCH] D79926: [lld-macho] Support .subsections_via_symbols

Shoaib Meenai via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri May 15 17:24:53 PDT 2020


smeenai requested changes to this revision.
smeenai added a comment.
This revision now requires changes to proceed.

Looks really good! Putting this back on your queue for some minor comments and questions.



================
Comment at: lld/MachO/InputFiles.cpp:146
     isec->flags = sec.flags;
-    ret.push_back(isec);
+    subsections.push_back({{0, isec}});
   }
----------------
Nit: would `emplace_back` let us get rid of one pair of curlies?


================
Comment at: lld/MachO/InputFiles.cpp:153
+//
+// \p offset should describe an offset relative to the start of the original
+// InputSection (before any subsection splitting has occurred). It will be
----------------
I think `\p` only makes sense for Doxygen-style comments (which would have triple slashes)? I'm not very well versed with this though.


================
Comment at: lld/MachO/InputFiles.cpp:157
+// the containing subsection.
+InputSection *findContainingSubsection(SubsectionMap &map, uint32_t *offset) {
+  auto it = std::prev(map.upper_bound(*offset));
----------------
Nit: make this `static`


================
Comment at: lld/MachO/InputFiles.cpp:199
+      // TODO: Figure out what to do for non-pcrel section relocations.
+      // TODO: The offset of 4 is probably not right for ARM64.
+      uint32_t targetOffset =
----------------
I know you're just moving this code here, but we should also add a TODO that the offset of 4 only works for relocations with a length of 4 bytes (i.e. a length field of 2).


================
Comment at: lld/MachO/InputFiles.cpp:214
+                             bool subsectionsViaSymbols) {
+  // resize(), not reserve(), because we are going to create N_ALT_ENTRY symbols
+  // out-of-sequence.
----------------
Thanks for the comment!


================
Comment at: lld/MachO/InputFiles.cpp:252
+
+    // We can't create alt-entry symbols at this point because a later symbol
+    // may split its section, which may affect which subsection the alt-entry
----------------
The implication is that the nList entries aren't (necessarily) in address order, right? Otherwise we'd know the final subsection for an alt-entry symbol by the time we reached it.


================
Comment at: lld/MachO/InputFiles.cpp:278
+    firstIsec->data = firstIsec->data.slice(0, firstSize);
+    secondIsec->align = MinAlign(firstIsec->align, offset);
+
----------------
What's the reason for this?


================
Comment at: lld/MachO/InputSection.cpp:46
 
     uint64_t val = va + addend;
     if (1) // TODO: handle non-pcrel relocations
----------------
Nit: now that we're not adjusting `addend` anywhere, we can just get rid of the variable and use `r.addend` directly everywhere, right?


================
Comment at: lld/test/MachO/subsections-section-relocs.s:9
+# RUN: lld -flavor darwinnew -o %t/test %t/test.o -order_file %t/order-file
+# RUN: llvm-objdump --section-headers -d --no-show-raw-insn %t/test | FileCheck %s
+# CHECK-LABEL: Sections:
----------------
It'd be nice to dump the string section and confirm the ordering is correct (to make sure we're setting subsection contents directly).


================
Comment at: lld/test/MachO/subsections-section-relocs.s:14
+# CHECK:       [[#%x, ADDR:]]: leaq
+# CHECK-SAME:    [[#%u, CSTRING_ADDR - ADDR - 3]](%rip), %rsi {{.*}} <_bar_str+0x4>
+
----------------
Could you add a comment explaining the ` - 3`? I believe it's because `L_.str` should end up at `CSTRING_ADDR + 4`, and the `leaq` instruction is 7 bytes long so the relocation is against `ADDR + 7`, so you get `CSTRING_ADDR + 4 - (ADDR + 7)`, but that took a bit of working out :)


================
Comment at: lld/test/MachO/subsections-section-relocs.s:32
+## References to this generate a section relocation
+## N.B.: ld64 doesn't actually reorder symbols in __cstring based on the order
+##       file. Only our implementation does. However, I'm not sure how else to
----------------
Hmm, how can it emit a symbol relocation against a private symbol? Or does it convert it into a non-private symbol?

In general, ld64 atomizes cstring sections by NUL delimiters instead of by symbols, so we'll need to adjust things if we go down that route. This is good for now though.


================
Comment at: lld/test/MachO/subsections-symbol-relocs.s:28
+# CHECK-EMPTY:
+# CHECK-NEXT:  <_qux>:
+# CHECK:       <_foo>:
----------------
It'd be good to verify the rest of the instructions as well, so we can confirm the subsection contents are correct.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79926/new/

https://reviews.llvm.org/D79926





More information about the llvm-commits mailing list