[llvm] [CodeGen] Add an option to skip extTSP BB placement for huge functions. (PR #99310)

Krzysztof Pszeniczny via llvm-commits llvm-commits at lists.llvm.org
Wed Jul 17 04:50:06 PDT 2024


https://github.com/amharc created https://github.com/llvm/llvm-project/pull/99310

The extTSP-based basic block layout algorithm improves the performance of the generated code, but unfortunately it has a super-linear time complexity. This leads to extremely long compilation times for certain relatively rare kinds of autogenerated code.

This patch adds an `-mllvm` flag, `-ext-tsp-block-placement-max-blocks`, that restricts extTSP to functions whose basic block count does not exceed a specified threshold. The default is `UINT_MAX`, so behaviour is unchanged unless the flag is set. While commit bcdc0477319a26fd8dcdde5ace3bdd6743599f44 added a knob to limit the maximum chain size, the number of chains can still be very large for certain huge functions, leading to quadratic behaviour in ExtTSPImpl::mergeChainPairs.
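For readers who want the gating logic in isolation, the following is a minimal standalone sketch of the condition this patch extends in `MachineBlockPlacement::runOnMachineFunction`. The `PlacementConfig` struct and the `shouldApplyExtTsp` function are hypothetical names used for illustration only; in the actual patch the values come from `cl::opt` globals and from `MachineFunction`.

```cpp
#include <climits>
#include <cstddef>

// Hypothetical stand-in for the command-line options read by the pass.
// Field names are illustrative and are not LLVM identifiers.
struct PlacementConfig {
  bool EnableExtTsp = true;        // models -enable-ext-tsp-block-placement
  bool ApplyWithoutProfile = false; // models ApplyExtTspWithoutProfile
  unsigned MaxBlocks = UINT_MAX;   // models -ext-tsp-block-placement-max-blocks
};

// Returns true when the ext-TSP post-processing step would run.
// Mirrors the condition in the patch: at least 3 blocks, ext-TSP enabled,
// profile data present (or explicitly not required), and the block count
// within the configured maximum.
inline bool shouldApplyExtTsp(std::size_t NumBlocks, bool HasProfileData,
                              const PlacementConfig &Cfg) {
  return NumBlocks >= 3 && Cfg.EnableExtTsp &&
         (Cfg.ApplyWithoutProfile || HasProfileData) &&
         NumBlocks <= Cfg.MaxBlocks;
}
```

With the default `MaxBlocks` of `UINT_MAX` the new clause never rejects a function, which is why the patch does not change existing behaviour. Since the comparison is `<=`, a function with exactly the threshold number of blocks is still processed. From clang, the flag would presumably be passed as `-mllvm -ext-tsp-block-placement-max-blocks=N`, with a value of `N` chosen by the user.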

From 532c1c8ce887d23ba1c35e051951da1167a5c2a6 Mon Sep 17 00:00:00 2001
From: Krzysztof Pszeniczny <kpszeniczny at google.com>
Date: Wed, 17 Jul 2024 13:31:18 +0200
Subject: [PATCH] [CodeGen] Add an option to skip extTSP BB placement for huge
 functions.

The extTSP-based basic block layout algorithm improves the performance
of the generated code, but unfortunately it has a super-linear time
complexity. This leads to extremely long compilation times for certain
kinds of autogenerated code.

This patch adds an -mllvm flag to restrict extTSP only to functions
smaller than a specified threshold. While commit
bcdc0477319a26fd8dcdde5ace3bdd6743599f44 added a knob to limit the
maximum chain size, it's still possible that for certain huge functions
the number of chains is very large, leading to a quadratic behaviour
in ExtTSPImpl::mergeChainPairs.
---
 llvm/lib/CodeGen/MachineBlockPlacement.cpp        |  9 ++++++++-
 .../CodeGen/X86/code_placement_ext_tsp_large.ll   | 15 +++++++++++++++
 2 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/llvm/lib/CodeGen/MachineBlockPlacement.cpp b/llvm/lib/CodeGen/MachineBlockPlacement.cpp
index 4c864ca15ccc5..2f8e6adc43103 100644
--- a/llvm/lib/CodeGen/MachineBlockPlacement.cpp
+++ b/llvm/lib/CodeGen/MachineBlockPlacement.cpp
@@ -213,6 +213,12 @@ static cl::opt<bool> RenumberBlocksBeforeView(
         "into a dot graph. Only used when a function is being printed."),
     cl::init(false), cl::Hidden);
 
+static cl::opt<unsigned> ExtTspBlockPlacementMaxBlocks(
+    "ext-tsp-block-placement-max-blocks",
+    cl::desc("Maximum number of basic blocks in a function to run ext-TSP "
+             "block placement."),
+    cl::init(UINT_MAX), cl::Hidden);
+
 namespace llvm {
 extern cl::opt<bool> EnableExtTspBlockPlacement;
 extern cl::opt<bool> ApplyExtTspWithoutProfile;
@@ -3511,7 +3517,8 @@ bool MachineBlockPlacement::runOnMachineFunction(MachineFunction &MF) {
 
   // Apply a post-processing optimizing block placement.
   if (MF.size() >= 3 && EnableExtTspBlockPlacement &&
-      (ApplyExtTspWithoutProfile || MF.getFunction().hasProfileData())) {
+      (ApplyExtTspWithoutProfile || MF.getFunction().hasProfileData()) &&
+      MF.size() <= ExtTspBlockPlacementMaxBlocks) {
     // Find a new placement and modify the layout of the blocks in the function.
     applyExtTsp();
 
diff --git a/llvm/test/CodeGen/X86/code_placement_ext_tsp_large.ll b/llvm/test/CodeGen/X86/code_placement_ext_tsp_large.ll
index 842aced4884f7..ac172d32c6d8b 100644
--- a/llvm/test/CodeGen/X86/code_placement_ext_tsp_large.ll
+++ b/llvm/test/CodeGen/X86/code_placement_ext_tsp_large.ll
@@ -2,6 +2,7 @@
 ; RUN: llc -mcpu=corei7 -mtriple=x86_64-linux -enable-ext-tsp-block-placement=1 -ext-tsp-chain-split-threshold=128 -debug-only=block-placement < %s 2>&1 | FileCheck %s
 ; RUN: llc -mcpu=corei7 -mtriple=x86_64-linux -enable-ext-tsp-block-placement=1 -ext-tsp-chain-split-threshold=1 -debug-only=block-placement < %s 2>&1 | FileCheck %s -check-prefix=CHECK2
 ; RUN: llc -mcpu=corei7 -mtriple=x86_64-linux -enable-ext-tsp-block-placement=0 -debug-only=block-placement < %s 2>&1 | FileCheck %s -check-prefix=CHECK3
+; RUN: llc -mcpu=corei7 -mtriple=x86_64-linux -enable-ext-tsp-block-placement=1 -ext-tsp-block-placement-max-blocks=8 -debug-only=block-placement < %s 2>&1 | FileCheck %s -check-prefix=CHECK4
 
 @yydebug = dso_local global i32 0, align 4
 
@@ -110,6 +111,20 @@ define void @func_large() !prof !0 {
 ; CHECK3: b7
 ; CHECK3: b8
 ; CHECK3: b9
+;
+; An expected output with function size larger than the threshold -- the layout is not modified:
+;
+; CHECK4-LABEL: func_large:
+; CHECK4: b0
+; CHECK4: b1
+; CHECK4: b2
+; CHECK4: b3
+; CHECK4: b4
+; CHECK4: b5
+; CHECK4: b6
+; CHECK4: b7
+; CHECK4: b8
+; CHECK4: b9
 
 b0:
   %0 = load i32, ptr @yydebug, align 4


