[llvm] [CodeLayout] cache-directed sort: limit max chain size (PR #69039)
Fangrui Song via llvm-commits
llvm-commits at lists.llvm.org
Mon Oct 16 14:43:49 PDT 2023
MaskRay wrote:
> Can you share perf. improvement numbers to compare with results in #68638?
>
> BTW, I have a local patch to improve performance for larger chains (based on my speculation [reviews.llvm.org/D152834#inline-1500317](https://reviews.llvm.org/D152834#inline-1500317)). If this can wait for a while, I will try to send a PR in one week.
Personally, I suspect that changing the max-chain-size limit has a very limited impact. Is there a script so that everybody can verify https://reviews.llvm.org/D152834#inline-1500317 and check the performance improvement? I can only benchmark Clang if someone shares input data with me. (For instance, I can build llvm-project or the Linux kernel with a new version of Clang.)
AIUI `llvm::erase_value(HotChains, From);` is just to support `LLVM_DEBUG(... HotChains.size())`. If I comment it out:
```
--- i/llvm/lib/Transforms/Utils/CodeLayout.cpp
+++ w/llvm/lib/Transforms/Utils/CodeLayout.cpp
@@ -1305,3 +1305,3 @@ private:
     // Remove the chain from the list of active chains.
-    llvm::erase_value(HotChains, From);
+    //llvm::erase_value(HotChains, From);
   }
```
```
% time /tmp/out/custom-gcc/bin/ld.lld @response.txt --call-graph-profile-sort=cdsort --threads=8 2> 0
/tmp/out/custom-gcc/bin/ld.lld @response.txt --call-graph-profile-sort=cdsort 544.51s user 4.70s system 102% cpu 8:54.62 total
```
With the call commented out, the link still takes 8:54. Without any change, `--call-graph-profile-sort=cdsort` also takes about 9 minutes. Therefore, `llvm::erase_value(HotChains, From);` is problematic but is not the bottleneck.
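To illustrate why the call looks problematic: as far as I know, `llvm::erase_value` is the erase-remove idiom, so every call is a linear scan of `HotChains`. Below is a minimal standalone sketch, not the code in CodeLayout.cpp. `eraseValue` and `scannedElements` are hypothetical names, and a plain `std::vector<int>` stands in for the list of chains.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical stand-in for llvm::erase_value: the erase-remove idiom.
// std::remove visits every element, so one call costs O(size of C).
template <typename Container, typename T>
void eraseValue(Container &C, const T &V) {
  C.erase(std::remove(C.begin(), C.end(), V), C.end());
}

// Simulates N successive removals from a vector of N ids (one removal per
// merge, as an assumption) and returns how many elements the scans visited.
std::size_t scannedElements(std::size_t N) {
  std::vector<int> Hot(N);
  std::iota(Hot.begin(), Hot.end(), 0); // ids 0 .. N-1
  std::size_t Scanned = 0;
  for (std::size_t I = 0; I < N; ++I) {
    Scanned += Hot.size(); // the scan walks the whole remaining vector
    eraseValue(Hot, static_cast<int>(I));
  }
  assert(Hot.empty());
  return Scanned; // N * (N + 1) / 2, i.e. quadratic in N
}
```

In this sketch the total work grows quadratically with the number of removals, which is why the call is worth cleaning up even though the timing above suggests it does not dominate the run.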
https://github.com/llvm/llvm-project/pull/69039