[llvm] [CodeGen] Add an option to skip extTSP BB placement for huge functions. (PR #99310)
Krzysztof Pszeniczny via llvm-commits
llvm-commits at lists.llvm.org
Wed Jul 17 04:50:06 PDT 2024
https://github.com/amharc created https://github.com/llvm/llvm-project/pull/99310
The extTSP-based basic block layout algorithm improves the performance of the generated code, but unfortunately it has a super-linear time complexity. This leads to extremely long compilation times for certain relatively rare kinds of autogenerated code.
This patch adds an `-mllvm` flag to optionally restrict extTSP only to functions smaller than a specified threshold. While commit
bcdc0477319a26fd8dcdde5ace3bdd6743599f44 added a knob to to limit the maximum chain size, it's still possible that for certain huge functions the number of chains is very large, leading to a quadratic behaviour in ExtTSPImpl::mergeChainPairs.
>From 532c1c8ce887d23ba1c35e051951da1167a5c2a6 Mon Sep 17 00:00:00 2001
From: Krzysztof Pszeniczny <kpszeniczny at google.com>
Date: Wed, 17 Jul 2024 13:31:18 +0200
Subject: [PATCH] [CodeGen] Add an option to skip extTSP BB placement for huge
functions.
The extTSP-based basic block layout algorithm improves the performance
of the generated code, but unfortunately it has a super-linear time
complexity. This leads to extremely long compilation times for certain
kinds of autogenerated code.
This patch adds an -mllvm flag to restrict extTSP only to functions
smaller than a specified threshold. While commit
bcdc0477319a26fd8dcdde5ace3bdd6743599f44 added a knob to to limit the
maximum chain size, it's still possible that for certain huge functions
the number of chains is very large, leading to a quadratic behaviour
in ExtTSPImpl::mergeChainPairs.
---
llvm/lib/CodeGen/MachineBlockPlacement.cpp | 9 ++++++++-
.../CodeGen/X86/code_placement_ext_tsp_large.ll | 15 +++++++++++++++
2 files changed, 23 insertions(+), 1 deletion(-)
diff --git a/llvm/lib/CodeGen/MachineBlockPlacement.cpp b/llvm/lib/CodeGen/MachineBlockPlacement.cpp
index 4c864ca15ccc5..2f8e6adc43103 100644
--- a/llvm/lib/CodeGen/MachineBlockPlacement.cpp
+++ b/llvm/lib/CodeGen/MachineBlockPlacement.cpp
@@ -213,6 +213,12 @@ static cl::opt<bool> RenumberBlocksBeforeView(
"into a dot graph. Only used when a function is being printed."),
cl::init(false), cl::Hidden);
+static cl::opt<unsigned> ExtTspBlockPlacementMaxBlocks(
+ "ext-tsp-block-placement-max-blocks",
+ cl::desc("Maximum number of basic blocks in a function to run ext-TSP "
+ "block placement."),
+ cl::init(UINT_MAX), cl::Hidden);
+
namespace llvm {
extern cl::opt<bool> EnableExtTspBlockPlacement;
extern cl::opt<bool> ApplyExtTspWithoutProfile;
@@ -3511,7 +3517,8 @@ bool MachineBlockPlacement::runOnMachineFunction(MachineFunction &MF) {
// Apply a post-processing optimizing block placement.
if (MF.size() >= 3 && EnableExtTspBlockPlacement &&
- (ApplyExtTspWithoutProfile || MF.getFunction().hasProfileData())) {
+ (ApplyExtTspWithoutProfile || MF.getFunction().hasProfileData()) &&
+ MF.size() <= ExtTspBlockPlacementMaxBlocks) {
// Find a new placement and modify the layout of the blocks in the function.
applyExtTsp();
diff --git a/llvm/test/CodeGen/X86/code_placement_ext_tsp_large.ll b/llvm/test/CodeGen/X86/code_placement_ext_tsp_large.ll
index 842aced4884f7..ac172d32c6d8b 100644
--- a/llvm/test/CodeGen/X86/code_placement_ext_tsp_large.ll
+++ b/llvm/test/CodeGen/X86/code_placement_ext_tsp_large.ll
@@ -2,6 +2,7 @@
; RUN: llc -mcpu=corei7 -mtriple=x86_64-linux -enable-ext-tsp-block-placement=1 -ext-tsp-chain-split-threshold=128 -debug-only=block-placement < %s 2>&1 | FileCheck %s
; RUN: llc -mcpu=corei7 -mtriple=x86_64-linux -enable-ext-tsp-block-placement=1 -ext-tsp-chain-split-threshold=1 -debug-only=block-placement < %s 2>&1 | FileCheck %s -check-prefix=CHECK2
; RUN: llc -mcpu=corei7 -mtriple=x86_64-linux -enable-ext-tsp-block-placement=0 -debug-only=block-placement < %s 2>&1 | FileCheck %s -check-prefix=CHECK3
+; RUN: llc -mcpu=corei7 -mtriple=x86_64-linux -enable-ext-tsp-block-placement=1 -ext-tsp-block-placement-max-blocks=8 -debug-only=block-placement < %s 2>&1 | FileCheck %s -check-prefix=CHECK4
@yydebug = dso_local global i32 0, align 4
@@ -110,6 +111,20 @@ define void @func_large() !prof !0 {
; CHECK3: b7
; CHECK3: b8
; CHECK3: b9
+;
+; An expected output with function size larger than the threshold -- the layout is not modified:
+;
+; CHECK4-LABEL: func_large:
+; CHECK4: b0
+; CHECK4: b1
+; CHECK4: b2
+; CHECK4: b3
+; CHECK4: b4
+; CHECK4: b5
+; CHECK4: b6
+; CHECK4: b7
+; CHECK4: b8
+; CHECK4: b9
b0:
%0 = load i32, ptr @yydebug, align 4
More information about the llvm-commits
mailing list