[PATCH] D131247: [ELF] Parallelize writes of different OutputSections

Andrew Ng via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Aug 15 10:06:48 PDT 2022


andrewng added a comment.

I've looked a bit more at the Windows performance degradation and have come up with the following code for `taskSize` and `asyncParallelFor`:

  size_t tasks = size / (4 * 1024 * 1024);
  size_t taskSize = tasks ? sections.size() / tasks : sections.size();
  asyncParallelFor(tg, std::max<size_t>(1, taskSize), 0, sections.size(), fn);

The `4MB` threshold is somewhat arbitrary, but it appears to work OK on my Windows PC and mostly eliminates the performance degradation that I've seen so far. In fact, there's a ~3% improvement for `mozilla` from `lld-speed-test.tar.xz`.
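To make the arithmetic of the heuristic easier to check, here is a minimal standalone sketch of the same computation. It assumes that `size` in the snippet is the total output size in bytes and that `sections.size()` is the section count. The names `sectionsPerTask`, `totalBytes`, `numSections` and `bytesPerTask` are hypothetical and are not part of the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical helper (not in the patch): returns how many sections to hand
// to each parallel task so that a task covers roughly `bytesPerTask` bytes.
// Assumes `totalBytes` corresponds to `size` in the snippet above.
inline std::size_t sectionsPerTask(std::size_t totalBytes,
                                   std::size_t numSections,
                                   std::size_t bytesPerTask = 4 * 1024 * 1024) {
  assert(bytesPerTask > 0 && "bytesPerTask must be non-zero");
  // Number of whole bytesPerTask-sized chunks in the output.
  std::size_t tasks = totalBytes / bytesPerTask;
  // Below one chunk, keep all sections in a single task.
  std::size_t taskSize = tasks ? numSections / tasks : numSections;
  // Never return 0, even when there are more chunks than sections.
  return std::max<std::size_t>(1, taskSize);
}
```

Under these assumptions, an output smaller than 4MB stays in one task, while a 40MB output with 100 sections is split into tasks of 10 sections each. The split is by section count, so the bytes per task are only approximately even when section sizes vary.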

I've also tried to test on Linux, although only with an Ubuntu 22.04.1 VM on my Windows PC. I see a similar performance degradation there for `scylla` and `mozilla` (and for the UE4-based link too). @MaskRay, could you please try testing `scylla` and `mozilla` to see if you can reproduce the performance degradation? The above change also improves the situation on my setup and actually results in performance improvements for the problematic test cases.

I'm not really sure what the next steps should be for this review. Parallel optimisations of this nature are always going to be somewhat tricky across platforms.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D131247/new/

https://reviews.llvm.org/D131247


