[PATCH] D121342: [lld-macho] Align cstrings less conservatively

Jez Ng via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Mar 9 19:58:15 PST 2022


int3 marked 4 inline comments as done.
int3 added inline comments.


================
Comment at: lld/MachO/SyntheticSections.cpp:1375-1378
+// We differ slightly from ld64 in how we've chosen to align these cstrings.
+// Both LLD and ld64 preserve the number of trailing zeros in each cstring's
+// address in the input object files. When deduplicating identical cstrings,
+// both linkers pick the cstring whose address has more trailing zeros, and
----------------
oontvoo wrote:
> should we note soemthing about this in the lld-vs-ld64.rst doc?
> (Not this implementation detail particularly but rather the possible observable difference in the final binaries)
Done (but with a rather brief note). I don't think it's worth diving into the specifics of e.g. `16 * k + 2` (maybe that's what you meant by 'implementation detail'), that would just confuse most end users... but yeah, on the off chance that some build depends on this exact behavior, it could help someone triage the problem. Thanks!


================
Comment at: lld/MachO/SyntheticSections.cpp:1419-1429
+      auto s = isec->getCachedHashStringRef(i);
+      auto it = stringOffsetMap.find(s);
+      assert(it != stringOffsetMap.end());
+      uint64_t &outSecOff = it->second.outSecOff;
+      if (outSecOff == UINT64_MAX) {
+        outSecOff = alignTo(size, 1 << it->second.trailingZeros);
+        size = outSecOff + s.size();
----------------
oontvoo wrote:
> how's this? 
> just to make it a bit clearer you're updating the entry - otherwise it's a bit easy to miss at first glance.
> 
> (related nit: could have a static constant (eg., TombStone) so we dont have to remember using UINT64_MAX)
excellent suggestion, it's clearer this way :)

re static constant, we already use `UINT*_MAX` as a sentinel value in a lot of other places in the code (both in LLD/ELF and LLD/MachO), so I think it's a pretty common pattern overall and not likely to cause confusion


================
Comment at: lld/MachO/SyntheticSections.h:542
+    uint8_t trailingZeros;
+    uint64_t outSecOff = UINT64_MAX;
+
----------------
oontvoo wrote:
> (consistency with "isec")
`outSecOff` is already used in a lot of other places though, so I'm just following suit here. Arguably it's inconsistent with `isec`, but the counterargument is that member names should be more explicit/verbose than local variable names.

FWIW, LLD/ELF uses `outSecOff` together with `isec` as well.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D121342/new/

https://reviews.llvm.org/D121342



More information about the llvm-commits mailing list