[llvm] [CodeLayout] cache-directed sort: limit max chain size (PR #69039)

Fangrui Song via llvm-commits llvm-commits at lists.llvm.org
Mon Oct 16 14:43:49 PDT 2023


MaskRay wrote:

> Can you share perf. improvement numbers to compare with results in #68638?
> 
> BTW, I have a local patch to improve performance for larger chains (based on my speculation [reviews.llvm.org/D152834#inline-1500317](https://reviews.llvm.org/D152834#inline-1500317)). If this can wait for a while, I will try to send a PR in one week.

Personally, I suspect that changing the max-chain-size limit has a very limited impact. Is there a script so that everybody can verify https://reviews.llvm.org/D152834#inline-1500317 and check the performance improvement? I can only benchmark Clang if someone shares input data with me. (For instance, I can build llvm-project or the Linux kernel with a new version of Clang.)

AIUI `llvm::erase_value(HotChains, From);` is just to support `LLVM_DEBUG(... HotChains.size())`. If I comment it out:
```
--- i/llvm/lib/Transforms/Utils/CodeLayout.cpp
+++ w/llvm/lib/Transforms/Utils/CodeLayout.cpp
@@ -1305,3 +1305,3 @@ private:
     // Remove the chain from the list of active chains.
-    llvm::erase_value(HotChains, From);
+    //llvm::erase_value(HotChains, From);
   }
```

```
% time /tmp/out/custom-gcc/bin/ld.lld @response.txt --call-graph-profile-sort=cdsort --threads=8 2> 0
/tmp/out/custom-gcc/bin/ld.lld @response.txt --call-graph-profile-sort=cdsort  544.51s user 4.70s system 102% cpu 8:54.62 total
```

With the line commented out, the link takes 8:54. Without any change, `--call-graph-profile-sort=cdsort` also takes about 9 minutes. Therefore, `llvm::erase_value(HotChains, From);` is problematic but is not the bottleneck.
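For readers who have not looked at the helper: `llvm::erase_value(C, V)` is the erase-remove idiom, so each call scans the whole container. A minimal standalone sketch of that cost follows, using only the standard library. `Chain`, `eraseValue`, and `simulateMerges` are hypothetical names for illustration, and the assumption that `HotChains` is a vector of chain pointers with one removal per merge is mine, not something measured here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the chain type; only pointer identity matters.
struct Chain {
  int Id;
};

// Standard-library equivalent of llvm::erase_value(Container, Value):
// the erase-remove idiom, which walks the whole vector on every call.
template <typename T> void eraseValue(std::vector<T> &C, const T &V) {
  C.erase(std::remove(C.begin(), C.end(), V), C.end());
}

// Removes all chains but the last, one per simulated merge, and returns the
// total number of elements scanned. The total grows quadratically with N.
std::size_t simulateMerges(std::size_t N) {
  std::vector<Chain> Storage(N);
  std::vector<Chain *> HotChains;
  for (Chain &C : Storage)
    HotChains.push_back(&C);

  std::size_t Scanned = 0;
  for (std::size_t I = 0; I + 1 < N; ++I) {
    Scanned += HotChains.size(); // one full pass per removal
    eraseValue(HotChains, &Storage[I]);
    assert(HotChains.size() == N - I - 1);
  }
  return Scanned;
}
```

Under these assumptions the scan count is N(N+1)/2 - 1, which is why the call looks suspicious, even though the timing above suggests the dominant cost is elsewhere.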

https://github.com/llvm/llvm-project/pull/69039

