[PATCH] D132560: [lld-macho] Add initial support for chained fixups

Nico Weber via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Sep 16 07:26:27 PDT 2022


thakis accepted this revision.
thakis added a comment.
This revision is now accepted and ready to land.

This is excellent.

Apologies for taking so long to take a real look.

A long list of nits below, but nothing substantial.



================
Comment at: lld/MachO/Arch/ARM64Common.h:116
+               pageBits(pointerVA) - pcPageBits);
+  encodePageOff12(&buf32[1], stubCode[1], pointerVA);
   buf32[2] = stubCode[2];
----------------
(this should be

```
  encodePage21(&buf32[0], d, stubCode[0], pageBits(pointerVA) - pcPageBits);
  encodePageOff12(&buf32[1], d, stubCode[1], pointerVA);
```

after rebasing across D133269)


================
Comment at: lld/MachO/Driver.cpp:933
+        {PLATFORM_WATCHOS, VersionTuple(6, 0)},
+        {PLATFORM_BRIDGEOS, VersionTuple(4, 0)}}};
   PlatformType platform = removeSimulator(config->platformInfo.target.Platform);
----------------
BertalanD wrote:
> thakis wrote:
> > (This seems unrelated. I'd land that separately if you want (no review needed), or just revert it. This runs just once, so std::vector is good enough, and in c++20 vector is constexpr anyways. And having to manually repeat the number of elements is a tiny bit inconvenient.)
> I know this change is absolutely insignificant, but this pointless allocation has been bothering me so much...
> 
> I'll land this change separately.
Software is one of the few areas where human brains have to deal with 9+ orders of magnitude within a relatively short timespan: a single bit can be a critical source of a bug, and right after that you might think about how to best organize gigabytes of data. You might try to shave off cycles form a very hot inner loop, or have a function that's 10^6 times as inefficient than it could be and it'll still finished imperceptibly fast as it's only called once.

This is really hard to wrap your brain around.

I get it, superfluous allocations feel silly. But a _single_ superfluous allocation that happens just one during startup is absolutely, completely irrelevant. (This one happens to go away by itself when we adopt C++20 too, but that fact really doesn't matter here.)

Go ahead and change it if you want (in a separate commit), but learning to quiet the part of the brain that screams and shouts "this! is! so! inefficient!" and considering if it actually matters is imho a very useful skill to have.

(In contrast, increasing readability and clarity code on the other hand is almost always worth it.)


================
Comment at: lld/MachO/Driver.cpp:1492
+  config->emitInitOffsets =
+      config->emitChainedFixups || args.hasArg(OPT_init_offsets);
   config->icfLevel = getICFLevel(args);
----------------
Maybe add something like "Like in ld64, enabling chained fixups enables the `-init_offsets` transformation" to commit message.


================
Comment at: lld/MachO/SyntheticSections.cpp:321
+    } else if (const auto *defined = dyn_cast<Defined>(sym)) {
+      if (defined->isExternalWeakDef() || defined->interposable)
+        in.chainedFixups->addBinding(sym, isec, offset, addend);
----------------
nit: This conditional is duplicated in writeChainedFixup. Should it be in a file-local-static `definedNeedsBind()` helper?


================
Comment at: lld/MachO/SyntheticSections.cpp:327
+      llvm_unreachable("cannot bind to an undefined symbol");
+    }
   } else {
----------------
nit: return here, keep non-chained path dedented and don't add an else


================
Comment at: lld/MachO/SyntheticSections.cpp:358
+  assert(config->emitChainedFixups);
+  assert(target->wordSize == 8 && "Only 64-bit platforms are supported");
+  auto *rebase = reinterpret_cast<dyld_chained_ptr_64_rebase *>(buf);
----------------
Maybe `assert(reinterpret_cast<uintptr_t>(buf) % target->wordSize == 0 && "unaligned fixup address")`? (optional)


================
Comment at: lld/MachO/SyntheticSections.cpp:360
+  auto *rebase = reinterpret_cast<dyld_chained_ptr_64_rebase *>(buf);
+  rebase->target = targetVA;
+  rebase->high8 = (targetVA >> 56);
----------------
nit: maybe `& 0xf'ffff'ffff` to make the truncation explicit?


================
Comment at: lld/MachO/SyntheticSections.cpp:376
+  assert(config->emitChainedFixups);
+  assert(target->wordSize == 8 && "Only 64-bit platforms are supported");
+  auto *bind = reinterpret_cast<dyld_chained_ptr_64_bind *>(buf);
----------------
(same)


================
Comment at: lld/MachO/SyntheticSections.cpp:2101
+  return alignTo<8>(sizeof(dyld_chained_starts_in_segment) +
+                    pageStarts.back().first * sizeof(uint16_t));
+}
----------------
Since we always write this section: Can we end up in a situation where we write this section but `pageStarts` is empty (and calling `back()` is UB)?

AFAICT pageStarts is only added to if there actually are fixups, and a tiny program not calling anything might not have any (?)


================
Comment at: lld/MachO/SyntheticSections.cpp:2109
+  segInfo->page_size = target->getPageSize();
+  // FIXME: Use DYLD_CHAINED_PTR_64_OFFSET on newer OS versions.
+  segInfo->pointer_format = DYLD_CHAINED_PTR_64;
----------------
Out of curiosity: What are the implications of (not) doing this?


================
Comment at: lld/MachO/SyntheticSections.cpp:2113
+  segInfo->max_valid_pointer = 0; // not used on 64-bit
+  segInfo->page_count = pageStarts.back().first + 1;
+
----------------
(same question as above)


================
Comment at: lld/MachO/SyntheticSections.cpp:2147
+
+void ChainedFixupsSection::writeTo(uint8_t *buf) const {
+  auto *header = reinterpret_cast<dyld_chained_fixups_header *>(buf);
----------------
Is the on-disk layout of this section described anywhere? Consider putting a short comment at the top of this function that describes the layout, maybe.

Maybe something like:

```
// LC_DYLD_CHAINED_FIXUPS data consists of (in this order):
// * A dyld_chained_fixups_header
// * A dyld_chained_starts_in_image
// * One dyld_chained_starts_in_segment per segment
// * List of all imports (dyld_chained_import, dyld_chained_import_addend, 
     or dyld_chained_import_addend64)
// * Names of imported symbols
```

(Do we have flexibility where to put the symbol name table? Does ld64 put it after the imports?)


================
Comment at: lld/MachO/SyntheticSections.cpp:2149
+  auto *header = reinterpret_cast<dyld_chained_fixups_header *>(buf);
+  header->fixups_version = 0;
+  header->imports_count = bindings.size();
----------------
Maybe add `// This is step 3 of the algorithm described in the class comment of ChainedFixupsSection.`


================
Comment at: lld/MachO/SyntheticSections.cpp:2156
+
+  auto curOffset = [&]() {
+    return reinterpret_cast<uintptr_t>(buf) -
----------------
Explicitly list needed captures instead of saying `&`. `&` capture lists are bug magnets. (In this case it's obviously harmless, but it's imho still a good habit to almost never capture `&`)


================
Comment at: lld/MachO/SyntheticSections.cpp:2168
+  buf += alignTo<8>(sizeof(dyld_chained_starts_in_image) +
+                    outputSegments.size() * sizeof(uint32_t));
+
----------------
What is this hole for? Is it filled elsewhere?

i.e. maybe a comment like "dyld_chained_starts_in_image is followed by an uint32_t for each segment. Leave room for it, and fill it via the `segStarts` pointer. This is extremely obvious once you know it, but it took me a bit to figure out. (Now that I did, it's obvious to me too and I don't know why it took me that long. But it did.)


================
Comment at: lld/MachO/SyntheticSections.cpp:2175
+
+  buf += alignTo<8>(curOffset()) - curOffset();
+
----------------
Cute. But doesn't `buf = alignTo<8>(buf);` do the same thing in slightly less roundabout? :D


================
Comment at: lld/MachO/SyntheticSections.cpp:2185
+    buf += writeImport(buf, importFormat, libOrdinal, sym.isWeakRef(),
+                       nameOffset, import.second);
+    nameOffset += sym.getName().size() + 1;
----------------
nit: Maybe make writeImport return void and do `buf += importEntrySize(importFormat)` in the next line instead? feels a bit clearer to me – that way, writeImport has only a single responsibility.

(But a writing function returning how much it wrote is fairly common too, so up to you)


================
Comment at: lld/MachO/SyntheticSections.cpp:2203
+  assert(config->emitChainedFixups);
+
+  if (needsLargeAddend || !isUInt<23>(symtabSize))
----------------
Maybe add `// This is step 2 of the algorithm described in the class comment of ChainedFixupsSection.`


================
Comment at: lld/MachO/SyntheticSections.cpp:2204
+
+  if (needsLargeAddend || !isUInt<23>(symtabSize))
+    importFormat = DYLD_CHAINED_IMPORT_ADDEND64;
----------------
nit: Should something somewhere error out if `!isUInt<32>(symtabSize)` since even `dyld_chained_import_addend64::name_offset` is too small then?

(I doubt anything will ever hit that; otoh you check most other bounds and it's like 2 lines)


================
Comment at: lld/MachO/SyntheticSections.h:275
+//
+// Symbols are always bound eagerly when chained fixups are used. In that case,
+// StubSection contains indirect jumps to addresses stored in the GotSection.
----------------
No change needed here, but: They're bound eagerly at page-in time on new OS versions, which in a way means they're always bound lazily as-needed, right?

(Comment is fine as-is, I'm asking just to make sure I'm understanding things. And there are OS versions that have chained fixups but no page-in linking.)


================
Comment at: lld/MachO/SyntheticSections.h:702
+//
+// Userspace x86_64 and arm64 binaries and dylibs have two types of fixup
+// entries:
----------------
nit: dylibs are binaries too. s/binaries/executables/, but I'm guessing bundles (MH_BUNDLES) can have chained fixups as well, so maybe s/binaries and dylibs/binaries/ instead to cover all three.


================
Comment at: lld/MachO/SyntheticSections.h:731
+//   5. Finally, each page's (which might correspond to multiple sections)
+//      fixups are linked together in Writer::buildFixupChains().
+class ChainedFixupsSection final : public LinkEditSection {
----------------
Great comment :)


================
Comment at: lld/MachO/SyntheticSections.h:766
+  struct SegmentInfo {
+    SegmentInfo(const OutputSegment *oseg) : oseg(oseg){};
+
----------------
remove trailing extraneous `;`, then clang-format will do a better job of formatting this line

(…maybe we should build llvm with -Wextra-semi :) )


================
Comment at: lld/MachO/Writer.cpp:1058
+      in.weakBinding,    in.lazyBinding,
+      in.exports,        in.chainedFixups,
+      symtabSection,     indirectSymtabSection,
----------------
A bit unfortunate that clang-format wants to reformat this, makes it hard to see that the change here is the addition of `in.chainedFixups`. Even if you land the reformatting without the addition first, everything will shift around when you add it and it's still a bit hard to see (but maybe slightly easier since the change is on the first touched line then). So maybe precommit the format change (and the change to std::array – but see above :P)


================
Comment at: lld/MachO/Writer.cpp:1152
+    return;
+
+  const std::vector<Location> &loc = in.chainedFixups->getLocations();
----------------
Maybe add `// This is step 5 of the algorithm described in the class comment of ChainedFixupsSection.`


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D132560/new/

https://reviews.llvm.org/D132560



More information about the llvm-commits mailing list