[llvm] [CodeLayout] cache-directed sort: limit max chain size (PR #69039)

Fangrui Song via llvm-commits llvm-commits at lists.llvm.org
Fri Oct 13 19:04:20 PDT 2023


https://github.com/MaskRay created https://github.com/llvm/llvm-project/pull/69039

When linking a slightly larger executable (section sizes shown below),
ld.lld --call-graph-profile-sort=cdsort can be very slow (see #68638).
```
   4.6%  20.7Mi    .text.hot
   3.5%  15.9Mi    .text
   3.4%  15.2Mi    .text.unknown
```

Add a hidden cl option `cds-max-chain-size`, similar to
`ext-tsp-max-chain-size`, and set its default to 128 to improve performance.

In `ld.lld @response.txt --threads=4 --call-graph-profile-sort=cdsort --time-trace`
builds, the "Total Sort sections" time is measured as follows:

* -mllvm -cds-max-chain-size=64: 1.321813s
* -mllvm -cds-max-chain-size=128: 2.030425s
* -mllvm -cds-max-chain-size=256: 2.927684s
* -mllvm -cds-max-chain-size=512: 5.493106s
* unlimited: 9 minutes

Everything other than section sorting takes 6.8s.
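
The size limit described above can be sketched in isolation as follows. This is only an illustration under simplifying assumptions: `Chain`, `Edge`, `fitsChainLimit`, and `countMergeCandidates` are hypothetical stand-ins, not the types or functions used in CodeLayout.cpp. The point is that an edge whose two chains together exceed the limit is skipped before any merge gain is computed for it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for illustration only; these are not the
// classes used by the real implementation.
struct Chain {
  std::size_t NumBlocks; // number of nodes currently in the chain
};

struct Edge {
  const Chain *Src;
  const Chain *Dst;
};

// Returns true if merging the two chains joined by E would produce a
// chain of at most MaxChainSize nodes.
inline bool fitsChainLimit(const Edge &E, std::size_t MaxChainSize) {
  assert(E.Src && E.Dst && "edge must connect two chains");
  return E.Src->NumBlocks + E.Dst->NumBlocks <= MaxChainSize;
}

// Counts the edges that would still reach the (expensive) merge-gain
// computation once self edges and over-limit merges are filtered out.
inline std::size_t countMergeCandidates(const std::vector<Edge> &Edges,
                                        std::size_t MaxChainSize) {
  std::size_t NumCandidates = 0;
  for (const Edge &E : Edges) {
    if (E.Src == E.Dst)
      continue; // ignore self edges
    if (!fitsChainLimit(E, MaxChainSize))
      continue; // merged chain would exceed the limit
    ++NumCandidates;
  }
  return NumCandidates;
}
```

A smaller limit prunes more candidate merges, which is consistent with the shorter sort times measured above, at the cost of possibly producing a different layout.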


From a3c979a24e7d5b898bd46923dcb4e049b598a642 Mon Sep 17 00:00:00 2001
From: Fangrui Song <i at maskray.me>
Date: Fri, 13 Oct 2023 18:15:24 -0700
Subject: [PATCH] [CodeLayout] cache-directed sort: limit max chain size

When linking a slightly larger executable (section sizes shown below),
ld.lld --call-graph-profile-sort=cdsort can be very slow (see #68638).
```
   4.6%  20.7Mi    .text.hot
   3.5%  15.9Mi    .text
   3.4%  15.2Mi    .text.unknown
```

Add a hidden cl option `cds-max-chain-size`, similar to
`ext-tsp-max-chain-size`, and set its default to 128 to improve performance.

In `ld.lld @response.txt --threads=4 --call-graph-profile-sort=cdsort --time-trace`
builds, the "Total Sort sections" time is measured as follows:

* -mllvm -cds-max-chain-size=64: 1.321813s
* -mllvm -cds-max-chain-size=128: 2.030425s
* -mllvm -cds-max-chain-size=256: 2.927684s
* -mllvm -cds-max-chain-size=512: 5.493106s
* unlimited: 9 minutes

Everything other than section sorting takes 6.8s.
---
 llvm/lib/Transforms/Utils/CodeLayout.cpp           |  9 +++++++++
 llvm/unittests/Transforms/Utils/CodeLayoutTest.cpp | 13 +++++++++++++
 2 files changed, 22 insertions(+)

diff --git a/llvm/lib/Transforms/Utils/CodeLayout.cpp b/llvm/lib/Transforms/Utils/CodeLayout.cpp
index d9e302d8b4fa54d..5411cf7a6c7f6f8 100644
--- a/llvm/lib/Transforms/Utils/CodeLayout.cpp
+++ b/llvm/lib/Transforms/Utils/CodeLayout.cpp
@@ -62,6 +62,12 @@ cl::opt<bool> ApplyExtTspWithoutProfile(
     "ext-tsp-apply-without-profile",
     cl::desc("Whether to apply ext-tsp placement for instances w/o profile"),
     cl::init(true), cl::Hidden);
+
+namespace codelayout {
+cl::opt<unsigned>
+    CDMaxChainSize("cds-max-chain-size", cl::Hidden, cl::init(128),
+                   cl::desc("The maximum size of a chain to create"));
+}
 } // namespace llvm
 
 // Algorithm-specific params for Ext-TSP. The values are tuned for the best
@@ -1158,6 +1164,9 @@ class CDSortImpl {
         // Ignore loop edges.
         if (Edge->isSelfEdge())
           continue;
+        if (Edge->srcChain()->numBlocks() + Edge->dstChain()->numBlocks() >
+            CDMaxChainSize)
+          continue;
 
         // Compute the gain of merging the two chains.
         MergeGainT Gain = getBestMergeGain(Edge);
diff --git a/llvm/unittests/Transforms/Utils/CodeLayoutTest.cpp b/llvm/unittests/Transforms/Utils/CodeLayoutTest.cpp
index ce42f703229bd01..f2ac74df9afa35b 100644
--- a/llvm/unittests/Transforms/Utils/CodeLayoutTest.cpp
+++ b/llvm/unittests/Transforms/Utils/CodeLayoutTest.cpp
@@ -1,4 +1,5 @@
 #include "llvm/Transforms/Utils/CodeLayout.h"
+#include "llvm/Support/CommandLine.h"
 #include "gmock/gmock.h"
 #include "gtest/gtest.h"
 #include <vector>
@@ -7,6 +8,10 @@ using namespace llvm;
 using namespace llvm::codelayout;
 using testing::ElementsAreArray;
 
+namespace llvm::codelayout {
+extern cl::opt<unsigned> CDMaxChainSize;
+}
+
 namespace {
 TEST(CodeLayout, ThreeFunctions) {
   // Place the most likely successor (2) first.
@@ -40,6 +45,14 @@ TEST(CodeLayout, HotChain) {
     const std::vector<uint64_t> CallOffsets(std::size(Edges), 5);
     auto Order = computeCacheDirectedLayout(Sizes, Counts, Edges, CallOffsets);
     EXPECT_THAT(Order, ElementsAreArray({0, 3, 4, 2, 1}));
+
+    // -cds-max-chain-size disables forming a larger chain and therefore may
+    // change the result.
+    unsigned Saved = CDMaxChainSize;
+    CDMaxChainSize.setValue(3);
+    Order = computeCacheDirectedLayout(Sizes, Counts, Edges, CallOffsets);
+    EXPECT_THAT(Order, ElementsAreArray({0, 3, 4, 1, 2}));
+    CDMaxChainSize.setValue(Saved);
   }
 }
 
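
The unit test above saves the option value, overrides it, and restores it by hand. A scope guard is one way to make the restore automatic even if an assertion exits the scope early. The sketch below is a hypothetical helper, not part of this patch; it works on a plain copyable variable, so using it with a `cl::opt` would need an adapter that goes through `setValue`. LLVM's `SaveAndRestore` in `llvm/ADT/SaveAndRestore.h` serves a similar purpose for plain variables.

```cpp
#include <utility>

// Hypothetical helper for illustration: assigns a temporary value to a
// variable and restores the previous value when the guard is destroyed.
template <typename T> class ScopedOverride {
public:
  ScopedOverride(T &Var, T Temporary) : Target(Var), Saved(Var) {
    Target = std::move(Temporary);
  }
  ~ScopedOverride() { Target = std::move(Saved); }

  // Non-copyable: two guards restoring the same variable would be confusing.
  ScopedOverride(const ScopedOverride &) = delete;
  ScopedOverride &operator=(const ScopedOverride &) = delete;

private:
  T &Target; // variable being overridden
  T Saved;   // value to restore on scope exit
};
```

Usage would look like `ScopedOverride<unsigned> Guard(Limit, 3);` inside a block, with `Limit` returning to its previous value at the closing brace.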


