[llvm] [CodeLayout] cache-directed sort: limit max chain size (PR #69039)
Fangrui Song via llvm-commits
llvm-commits at lists.llvm.org
Fri Oct 13 19:04:20 PDT 2023
https://github.com/MaskRay created https://github.com/llvm/llvm-project/pull/69039
When linking a slightly larger executable, such as one with the section sizes below,
`ld.lld --call-graph-profile-sort=cdsort` can be very slow (see #68638).
```
4.6% 20.7Mi .text.hot
3.5% 15.9Mi .text
3.4% 15.2Mi .text.unknown
```
Add the cl option `cds-max-chain-size`, which is similar to
`ext-tsp-max-chain-size`, and set it to 128 to improve performance.
In `ld.lld @response.txt --threads=4 --call-graph-profile-sort=cdsort --time-trace`
builds, the "Total Sort sections" time is measured as follows:
* -mllvm -cds-max-chain-size=64: 1.321813
* -mllvm -cds-max-chain-size=128: 2.030425
* -mllvm -cds-max-chain-size=256: 2.927684
* -mllvm -cds-max-chain-size=512: 5.493106
* unlimited: 9 minutes
The rest of the link takes 6.8s.
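The effect of the cap can be illustrated with a simplified, self-contained sketch. This is not the `CDSortImpl` code: `ChainSet` and `mergeWithCap` are hypothetical names, and the sketch merges along edges in the order given, whereas the real pass chooses merges by gain. Only the guard, which skips a merge when the combined number of blocks exceeds the limit, mirrors the patch.

```cpp
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical stand-in for chain bookkeeping: every node starts in its own
// chain, tracked with a small union-find structure.
struct ChainSet {
  std::vector<std::size_t> Parent; // union-find parent of each node
  std::vector<std::size_t> Size;   // block count of the chain rooted here

  explicit ChainSet(std::size_t N) : Parent(N), Size(N, 1) {
    std::iota(Parent.begin(), Parent.end(), std::size_t{0});
  }

  // Return the representative node of the chain containing X.
  std::size_t find(std::size_t X) {
    while (Parent[X] != X) {
      Parent[X] = Parent[Parent[X]]; // path halving
      X = Parent[X];
    }
    return X;
  }
};

// Merge chains along Edges in order, skipping any merge that would create a
// chain with more than MaxChainSize blocks. Returns the number of merges.
std::size_t
mergeWithCap(ChainSet &Chains,
             const std::vector<std::pair<std::size_t, std::size_t>> &Edges,
             std::size_t MaxChainSize) {
  std::size_t NumMerges = 0;
  for (const auto &[Src, Dst] : Edges) {
    std::size_t A = Chains.find(Src);
    std::size_t B = Chains.find(Dst);
    if (A == B)
      continue; // both ends already in one chain (analogous to a self edge)
    if (Chains.Size[A] + Chains.Size[B] > MaxChainSize)
      continue; // size guard in the spirit of the patch
    Chains.Parent[B] = A;
    Chains.Size[A] += Chains.Size[B];
    ++NumMerges;
  }
  return NumMerges;
}
```

With five nodes connected as a path 0-1-2-3-4 and a cap of 3, the sketch forms the chains {0, 1, 2} and {3, 4} and skips the 2-3 merge. Because oversized candidates are rejected before any gain would be computed, the cap bounds the work spent per candidate merge, at the cost of possibly producing a different layout.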
From a3c979a24e7d5b898bd46923dcb4e049b598a642 Mon Sep 17 00:00:00 2001
From: Fangrui Song <i at maskray.me>
Date: Fri, 13 Oct 2023 18:15:24 -0700
Subject: [PATCH] [CodeLayout] cache-directed sort: limit max chain size
---
llvm/lib/Transforms/Utils/CodeLayout.cpp | 9 +++++++++
llvm/unittests/Transforms/Utils/CodeLayoutTest.cpp | 13 +++++++++++++
2 files changed, 22 insertions(+)
diff --git a/llvm/lib/Transforms/Utils/CodeLayout.cpp b/llvm/lib/Transforms/Utils/CodeLayout.cpp
index d9e302d8b4fa54d..5411cf7a6c7f6f8 100644
--- a/llvm/lib/Transforms/Utils/CodeLayout.cpp
+++ b/llvm/lib/Transforms/Utils/CodeLayout.cpp
@@ -62,6 +62,12 @@ cl::opt<bool> ApplyExtTspWithoutProfile(
     "ext-tsp-apply-without-profile",
     cl::desc("Whether to apply ext-tsp placement for instances w/o profile"),
     cl::init(true), cl::Hidden);
+
+namespace codelayout {
+cl::opt<unsigned>
+    CDMaxChainSize("cds-max-chain-size", cl::Hidden, cl::init(128),
+                   cl::desc("The maximum size of a chain to create"));
+}
 } // namespace llvm
 // Algorithm-specific params for Ext-TSP. The values are tuned for the best
@@ -1158,6 +1164,9 @@ class CDSortImpl {
         // Ignore loop edges.
         if (Edge->isSelfEdge())
           continue;
+        if (Edge->srcChain()->numBlocks() + Edge->dstChain()->numBlocks() >
+            CDMaxChainSize)
+          continue;
         // Compute the gain of merging the two chains.
         MergeGainT Gain = getBestMergeGain(Edge);
diff --git a/llvm/unittests/Transforms/Utils/CodeLayoutTest.cpp b/llvm/unittests/Transforms/Utils/CodeLayoutTest.cpp
index ce42f703229bd01..f2ac74df9afa35b 100644
--- a/llvm/unittests/Transforms/Utils/CodeLayoutTest.cpp
+++ b/llvm/unittests/Transforms/Utils/CodeLayoutTest.cpp
@@ -1,4 +1,5 @@
 #include "llvm/Transforms/Utils/CodeLayout.h"
+#include "llvm/Support/CommandLine.h"
 #include "gmock/gmock.h"
 #include "gtest/gtest.h"
 #include <vector>
@@ -7,6 +8,10 @@ using namespace llvm;
 using namespace llvm::codelayout;
 using testing::ElementsAreArray;
+namespace llvm::codelayout {
+extern cl::opt<unsigned> CDMaxChainSize;
+}
+
 namespace {
 TEST(CodeLayout, ThreeFunctions) {
   // Place the most likely successor (2) first.
@@ -40,6 +45,14 @@ TEST(CodeLayout, HotChain) {
   const std::vector<uint64_t> CallOffsets(std::size(Edges), 5);
   auto Order = computeCacheDirectedLayout(Sizes, Counts, Edges, CallOffsets);
   EXPECT_THAT(Order, ElementsAreArray({0, 3, 4, 2, 1}));
+
+  // -cds-max-chain-size disables forming a larger chain and therefore may
+  // change the result.
+  unsigned Saved = CDMaxChainSize;
+  CDMaxChainSize.setValue(3);
+  Order = computeCacheDirectedLayout(Sizes, Counts, Edges, CallOffsets);
+  EXPECT_THAT(Order, ElementsAreArray({0, 3, 4, 1, 2}));
+  CDMaxChainSize.setValue(Saved);
 }
}
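The unit test above saves the option value, overrides it, and restores it by hand. As a possible alternative, a scope guard could do the restore automatically. The sketch below is an assumption and not part of the patch: `ScopedOverride` is a hypothetical helper, and `FakeOpt` is a stand-in that only mimics the two operations the test relies on (implicit conversion to the value type and `setValue`), so that the example compiles without LLVM.

```cpp
// Hypothetical stand-in for an option object such as cl::opt<unsigned>:
// it converts implicitly to its value and can be updated via setValue().
struct FakeOpt {
  unsigned Value;
  operator unsigned() const { return Value; }
  void setValue(unsigned V) { Value = V; }
};

// Hypothetical helper: overrides an option for the lifetime of the guard and
// restores the previous value when the guard goes out of scope.
template <typename OptT, typename ValueT> class ScopedOverride {
  OptT &Opt;
  ValueT Saved;

public:
  ScopedOverride(OptT &O, ValueT New) : Opt(O), Saved(O) {
    Opt.setValue(New);
  }
  ~ScopedOverride() { Opt.setValue(Saved); }

  // Non-copyable, so the saved value is restored exactly once.
  ScopedOverride(const ScopedOverride &) = delete;
  ScopedOverride &operator=(const ScopedOverride &) = delete;
};
```

A test body would then wrap the second layout computation in a block containing `ScopedOverride<FakeOpt, unsigned> Guard(Opt, 3);`, and the original value would come back at the closing brace even if the block returned early.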