[PATCH] D110018: [lld-macho] Speed up markLive()

Mon Oct 18 20:22:10 PDT 2021

int3 added inline comments.

================
Comment at: lld/MachO/MarkLive.cpp:27

-// Set live bit on for each reachable chunk. Unmarked (unreachable)
-// InputSections will be ignored by Writer, so they will be excluded
-// from the final output.
-void markLive() {
-  TimeTraceScope timeScope("markLive");
+static const int poolSize = 16; // arbitrary - could be something else.
+static std::atomic<int> done(0);
----------------
`hardware_concurrency()` seems like a good initializer

================
Comment at: lld/MachO/MarkLive.cpp:30

-  // We build up a worklist of sections which have been marked as live. We only
-  // push into the worklist when we discover an unmarked section, and we mark
-  // as we push, so sections never appear twice in the list.
-  // Literal sections cannot contain references to other sections, so we only
-  // store ConcatInputSections in our worklist.
-  SmallVector<ConcatInputSection *, 256> worklist;
+struct Task {
+
----------------
hm, this seems more like a `Worker` or `Executor` to me, each of which executes a number of tasks or jobs...

================
Comment at: lld/MachO/MarkLive.cpp:36
+      : idx(idx), pool(queues) {
+    // Calcuate the portion of inputSections this task were to own.
+    // We divide inputSections into poolSize groups.
----------------

================
Comment at: lld/MachO/MarkLive.cpp:43-44
+      if (idx < inputSections.size()) {
+        start = end = idx;
+        skip = false;
+      } else {
----------------
if `start` and `end` denoted half-open (instead of closed) intervals, would we still need `skip`?

also, how do you feel about creating an array of `ArrayRef`s here, instead of working with raw indices? That uses a few extra ints, but given that the number of threads in the pool is small, that shouldn't be an issue.

================
Comment at: lld/MachO/MarkLive.cpp:77
+
+  void help(size_t targetIdx) {
+    do {
----------------
would prefer a more descriptive name... `run` / `execute`?

================
Comment at: lld/MachO/MarkLive.cpp:97-120
+      // TODO:  we can avoid iterating inputSections multiple times.
+      //
+      // S_ATTR_LIVE_SUPPORT sections are live if they point _to_ a live
+      // section.
+      // Process them in a second pass.
+      if (!skip) {
+        for (size_t i = start; i <= end; ++i) {
----------------
I'm kind of suspicious of this. It's marking sections as live based on the liveness of other sections, which are concurrently being marked as live. Depending on the order of writes, it may incorrectly conclude that a certain live section X is not live, and therefore not mark the sections that X points to as live.

================
Comment at: lld/MachO/MarkLive.cpp:146
+
+  void work() { help(idx); }
+
----------------
`work` is usually a noun, not a verb... maybe this could also be `run()` (or have `run` above take a default argument)

================
Comment at: lld/MachO/MarkLive.cpp:148
+
+  const size_t d = inputSections.size() / poolSize;
+  const size_t idx;      // index of this task's queue in the pool
----------------
can we have a more descriptive name, like `groupSize`?

================
Comment at: lld/MachO/MarkLive.cpp:150
+  const size_t idx;      // index of this task's queue in the pool
+  bool markDone = false; // if true, this thread has signald that it's done.
+  size_t start;          // start inputsect
----------------

================
Comment at: lld/MachO/MarkLive.cpp:171-172
+
+    if (start < idx || end >= size) {
+      if (idx < size) {
+        start = end = idx;
----------------
i'm kind of confused here. why are we comparing `start` to `idx` -- isn't `start` an index into the `size` array of jobs, whereas `idx` is the index of the `Task` vector of size `poolSize`? seems like we're comparing numbers on different scales...

same thing for `idx` and `size`

================
Comment at: lld/MachO/MarkLive.cpp:258
+
+void populateDeadSrips(std::vector<Task> &tasks) {
+  TimeTraceScope timeScope("populateDeadSrips");
----------------
I think the "no" makes it clearer :)

================
Comment at: lld/MachO/WorkStealingQueue.h:38
+private:
+  size_t cap_;
+  const size_t mask_;
----------------
the rest of the codebase doesn't use the `_` suffix; I think we can just use `capacity` here and have the getter be `getCapacity()`

================
Comment at: lld/unittests/MachOUnitTests/CMakeLists.txt:2
+add_lld_unittest(lldMachOUnitTests
+  WorkSteallingQueueTest.cpp        
+  )
----------------

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110018/new/

https://reviews.llvm.org/D110018