[llvm] [MachineOutliner] Efficient Implementation of MachineOutliner::findCandidates() (PR #88988)
Xuan Zhang via llvm-commits
llvm-commits at lists.llvm.org
Tue Apr 16 14:14:44 PDT 2024
https://github.com/xuanzh-meta created https://github.com/llvm/llvm-project/pull/88988
We reduce the time complexity of the main loop of MachineOutliner::findCandidates() from O(n^2) to O(n log n).
To do so, we sort RS.StartIndices in SuffixTree and replace the find_if call with a simple check against the most recently accepted candidate.
For each SuffixTree::RepeatedSubstring RS, finding a set of candidates that do not overlap with each other currently takes O(n^2) time, where n is the number of occurrences of this repeated substring (i.e., the size of RS.StartIndices). This is because each new candidate is checked against all previously accepted candidates with find_if, which is O(n) per call. The quadratic runtime becomes a problem as n grows. For clang, with [3], the maximum n goes from 12k to 100k, and the time to complete the main loop in MachineOutliner::findCandidates() goes from 17 seconds to 120 seconds.
To improve the runtime, we implement a more efficient algorithm with complexity O(n log n), using the fact that once RS.StartIndices is sorted, the find_if search can be replaced with an O(1) check. The O(n log n) complexity comes from the sorting. For clang, with [1], the time to complete the loop is reduced to only 28 seconds.
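The idea can be illustrated outside of LLVM with a minimal, standalone sketch. It is not the code in the patch: Interval, pickQuadratic, and pickSorted are hypothetical names used only for illustration, and std::sort stands in for llvm::sort. The sketch assumes, as in the description above, that all occurrences of one repeated substring have the same length. Under that assumption, once the start indices are visited in ascending order, the accepted candidate with the largest end index is always the last one accepted, so comparing against it alone is enough.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for an outlining candidate: a closed range
// [Start, End] of instruction indices.
struct Interval {
  unsigned Start;
  unsigned End;
};

// Quadratic variant: every new range is compared against all ranges
// accepted so far, mirroring the find_if-based check.
std::vector<Interval> pickQuadratic(const std::vector<unsigned> &Starts,
                                    unsigned Len) {
  assert(Len > 0 && "substring length must be positive");
  std::vector<Interval> Kept;
  for (unsigned Start : Starts) {
    unsigned End = Start + Len - 1;
    auto Overlap = std::find_if(Kept.begin(), Kept.end(),
                                [Start, End](const Interval &C) {
                                  return End >= C.Start && Start <= C.End;
                                });
    if (Overlap == Kept.end())
      Kept.push_back({Start, End});
  }
  return Kept;
}

// O(n log n) variant: sort the start indices once, then compare each new
// range only against the most recently accepted one.
std::vector<Interval> pickSorted(std::vector<unsigned> Starts, unsigned Len) {
  assert(Len > 0 && "substring length must be positive");
  std::sort(Starts.begin(), Starts.end());
  std::vector<Interval> Kept;
  for (unsigned Start : Starts) {
    // All ranges have the same length, so Kept.back() has the largest End.
    if (!Kept.empty() && Start <= Kept.back().End)
      continue; // Overlaps with the last accepted range; discard.
    Kept.push_back({Start, Start + Len - 1});
  }
  return Kept;
}
```

For start indices {0, 2, 4, 5, 9} and length 3, both functions keep the ranges starting at 0, 4, and 9. On unsorted input the quadratic variant depends on the visiting order and may keep a different set, which would be consistent with the reordered expectations in the test updates below.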
From 84db82fdb07928e5ba59a4f6e652dbfcfcd3acd7 Mon Sep 17 00:00:00 2001
From: Xuan Zhang <xuanzh at meta.com>
Date: Fri, 12 Apr 2024 09:59:15 -0700
Subject: [PATCH] efficient implementation of MachineOutliner::findCandidates()
---
llvm/lib/CodeGen/MachineOutliner.cpp | 11 ++---
llvm/lib/Support/SuffixTree.cpp | 5 +++
.../Analysis/IRSimilarityIdentifier/basic.ll | 26 ++++++------
.../IRSimilarityIdentifier/different.ll | 6 +--
.../IROutliner/outlining-commutative.ll | 20 +++++-----
llvm/test/tools/llvm-sim/single-sim-file.test | 40 +++++++++----------
llvm/test/tools/llvm-sim/single-sim.test | 40 +++++++++----------
7 files changed, 75 insertions(+), 73 deletions(-)
diff --git a/llvm/lib/CodeGen/MachineOutliner.cpp b/llvm/lib/CodeGen/MachineOutliner.cpp
index dc2f5ef15206e8..e682d42c76747e 100644
--- a/llvm/lib/CodeGen/MachineOutliner.cpp
+++ b/llvm/lib/CodeGen/MachineOutliner.cpp
@@ -616,17 +616,14 @@ void MachineOutliner::findCandidates(
// * End before the other starts
// * Start after the other ends
unsigned EndIdx = StartIdx + StringLen - 1;
- auto FirstOverlap = find_if(
- CandidatesForRepeatedSeq, [StartIdx, EndIdx](const Candidate &C) {
- return EndIdx >= C.getStartIdx() && StartIdx <= C.getEndIdx();
- });
- if (FirstOverlap != CandidatesForRepeatedSeq.end()) {
+ if (CandidatesForRepeatedSeq.size() > 0 &&
+ StartIdx <= CandidatesForRepeatedSeq.back().getEndIdx()) {
#ifndef NDEBUG
++NumDiscarded;
LLVM_DEBUG(dbgs() << " .. DISCARD candidate @ [" << StartIdx
<< ", " << EndIdx << "]; overlaps with candidate @ ["
- << FirstOverlap->getStartIdx() << ", "
- << FirstOverlap->getEndIdx() << "]\n");
+ << CandidatesForRepeatedSeq.back().getStartIdx() << ", "
+ << CandidatesForRepeatedSeq.back().getEndIdx() << "]\n");
#endif
continue;
}
diff --git a/llvm/lib/Support/SuffixTree.cpp b/llvm/lib/Support/SuffixTree.cpp
index eaa653078e0900..03ed1d02840aa1 100644
--- a/llvm/lib/Support/SuffixTree.cpp
+++ b/llvm/lib/Support/SuffixTree.cpp
@@ -274,6 +274,11 @@ void SuffixTree::RepeatedSubstringIterator::advance() {
RS.Length = Length;
for (unsigned StartIdx : RepeatedSubstringStarts)
RS.StartIndices.push_back(StartIdx);
+
+ // Sort the start indices so that we can efficiently check if candidates
+ // overlap with each other in MachineOutliner::findCandidates().
+ llvm::sort(RS.StartIndices);
+
break;
}
// At this point, either NewRS is an empty RepeatedSubstring, or it was
diff --git a/llvm/test/Analysis/IRSimilarityIdentifier/basic.ll b/llvm/test/Analysis/IRSimilarityIdentifier/basic.ll
index 1c08cb407c2e3c..b38e7d19973db6 100644
--- a/llvm/test/Analysis/IRSimilarityIdentifier/basic.ll
+++ b/llvm/test/Analysis/IRSimilarityIdentifier/basic.ll
@@ -4,7 +4,7 @@
; This is a simple test to make sure the IRSimilarityIdentifier and
; IRSimilarityPrinterPass is working.
-; CHECK: 4 candidates of length 6. Found in:
+; CHECK: 4 candidates of length 6. Found in:
; CHECK-NEXT: Function: turtle, Basic Block: (unnamed)
; CHECK-NEXT: Start Instruction: store i32 1, ptr %1, align 4
; CHECK-NEXT: End Instruction: store i32 6, ptr %6, align 4
@@ -17,7 +17,7 @@
; CHECK-NEXT: Function: dog, Basic Block: entry
; CHECK-NEXT: Start Instruction: store i32 6, ptr %0, align 4
; CHECK-NEXT: End Instruction: store i32 5, ptr %5, align 4
-; CHECK-NEXT:4 candidates of length 5. Found in:
+; CHECK-NEXT:4 candidates of length 5. Found in:
; CHECK-NEXT: Function: turtle, Basic Block: (unnamed)
; CHECK-NEXT: Start Instruction: store i32 2, ptr %2, align 4
; CHECK-NEXT: End Instruction: store i32 6, ptr %6, align 4
@@ -30,7 +30,7 @@
; CHECK-NEXT: Function: dog, Basic Block: entry
; CHECK-NEXT: Start Instruction: store i32 1, ptr %1, align 4
; CHECK-NEXT: End Instruction: store i32 5, ptr %5, align 4
-; CHECK-NEXT:4 candidates of length 4. Found in:
+; CHECK-NEXT:4 candidates of length 4. Found in:
; CHECK-NEXT: Function: turtle, Basic Block: (unnamed)
; CHECK-NEXT: Start Instruction: store i32 3, ptr %3, align 4
; CHECK-NEXT: End Instruction: store i32 6, ptr %6, align 4
@@ -43,7 +43,7 @@
; CHECK-NEXT: Function: dog, Basic Block: entry
; CHECK-NEXT: Start Instruction: store i32 2, ptr %2, align 4
; CHECK-NEXT: End Instruction: store i32 5, ptr %5, align 4
-; CHECK-NEXT:4 candidates of length 3. Found in:
+; CHECK-NEXT:4 candidates of length 3. Found in:
; CHECK-NEXT: Function: turtle, Basic Block: (unnamed)
; CHECK-NEXT: Start Instruction: store i32 4, ptr %4, align 4
; CHECK-NEXT: End Instruction: store i32 6, ptr %6, align 4
@@ -56,7 +56,7 @@
; CHECK-NEXT: Function: dog, Basic Block: entry
; CHECK-NEXT: Start Instruction: store i32 3, ptr %3, align 4
; CHECK-NEXT: End Instruction: store i32 5, ptr %5, align 4
-; CHECK-NEXT:4 candidates of length 2. Found in:
+; CHECK-NEXT:4 candidates of length 2. Found in:
; CHECK-NEXT: Function: turtle, Basic Block: (unnamed)
; CHECK-NEXT: Start Instruction: store i32 5, ptr %5, align 4
; CHECK-NEXT: End Instruction: store i32 6, ptr %6, align 4
@@ -70,40 +70,40 @@
; CHECK-NEXT: Start Instruction: store i32 4, ptr %4, align 4
; CHECK-NEXT: End Instruction: store i32 5, ptr %5, align 4
-define linkonce_odr void @fish() {
-entry:
- %0 = alloca i32, align 4
+define void @turtle() {
%1 = alloca i32, align 4
%2 = alloca i32, align 4
%3 = alloca i32, align 4
%4 = alloca i32, align 4
%5 = alloca i32, align 4
- store i32 6, ptr %0, align 4
+ %6 = alloca i32, align 4
store i32 1, ptr %1, align 4
store i32 2, ptr %2, align 4
store i32 3, ptr %3, align 4
store i32 4, ptr %4, align 4
store i32 5, ptr %5, align 4
+ store i32 6, ptr %6, align 4
ret void
}
-define void @turtle() {
+define void @cat() {
+entry:
+ %0 = alloca i32, align 4
%1 = alloca i32, align 4
%2 = alloca i32, align 4
%3 = alloca i32, align 4
%4 = alloca i32, align 4
%5 = alloca i32, align 4
- %6 = alloca i32, align 4
+ store i32 6, ptr %0, align 4
store i32 1, ptr %1, align 4
store i32 2, ptr %2, align 4
store i32 3, ptr %3, align 4
store i32 4, ptr %4, align 4
store i32 5, ptr %5, align 4
- store i32 6, ptr %6, align 4
ret void
}
-define void @cat() {
+define linkonce_odr void @fish() {
entry:
%0 = alloca i32, align 4
%1 = alloca i32, align 4
diff --git a/llvm/test/Analysis/IRSimilarityIdentifier/different.ll b/llvm/test/Analysis/IRSimilarityIdentifier/different.ll
index e5c9970b159b9f..70d422077c3e9c 100644
--- a/llvm/test/Analysis/IRSimilarityIdentifier/different.ll
+++ b/llvm/test/Analysis/IRSimilarityIdentifier/different.ll
@@ -14,11 +14,11 @@
; CHECK-NEXT: End Instruction: store i32 5, ptr %5, align 4
; CHECK-NEXT: 2 candidates of length 3. Found in:
; CHECK-NEXT: Function: turtle, Basic Block: (unnamed)
-; CHECK-NEXT: Start Instruction: %b = load i32, ptr %1, align 4
-; CHECK-NEXT: End Instruction: %d = load i32, ptr %3, align 4
-; CHECK-NEXT: Function: turtle, Basic Block: (unnamed)
; CHECK-NEXT: Start Instruction: %a = load i32, ptr %0, align 4
; CHECK-NEXT: End Instruction: %c = load i32, ptr %2, align 4
+; CHECK-NEXT: Function: turtle, Basic Block: (unnamed)
+; CHECK-NEXT: Start Instruction: %b = load i32, ptr %1, align 4
+; CHECK-NEXT: End Instruction: %d = load i32, ptr %3, align 4
define linkonce_odr void @fish() {
entry:
diff --git a/llvm/test/Transforms/IROutliner/outlining-commutative.ll b/llvm/test/Transforms/IROutliner/outlining-commutative.ll
index 8862dc295d4351..1534829bad7ba7 100644
--- a/llvm/test/Transforms/IROutliner/outlining-commutative.ll
+++ b/llvm/test/Transforms/IROutliner/outlining-commutative.ll
@@ -123,7 +123,7 @@ define void @outline_from_sub1() {
; CHECK-NEXT: [[A:%.*]] = alloca i32, align 4
; CHECK-NEXT: [[B:%.*]] = alloca i32, align 4
; CHECK-NEXT: [[C:%.*]] = alloca i32, align 4
-; CHECK-NEXT: call void @outlined_ir_func_2(ptr [[A]], ptr [[B]], ptr [[C]])
+; CHECK-NEXT: call void @outlined_ir_func_1(ptr [[A]], ptr [[B]], ptr [[C]])
; CHECK-NEXT: ret void
;
entry:
@@ -148,7 +148,7 @@ define void @outline_from_sub2() {
; CHECK-NEXT: [[A:%.*]] = alloca i32, align 4
; CHECK-NEXT: [[B:%.*]] = alloca i32, align 4
; CHECK-NEXT: [[C:%.*]] = alloca i32, align 4
-; CHECK-NEXT: call void @outlined_ir_func_2(ptr [[A]], ptr [[B]], ptr [[C]])
+; CHECK-NEXT: call void @outlined_ir_func_1(ptr [[A]], ptr [[B]], ptr [[C]])
; CHECK-NEXT: ret void
;
entry:
@@ -173,7 +173,7 @@ define void @dontoutline_from_flipped_sub3() {
; CHECK-NEXT: [[A:%.*]] = alloca i32, align 4
; CHECK-NEXT: [[B:%.*]] = alloca i32, align 4
; CHECK-NEXT: [[C:%.*]] = alloca i32, align 4
-; CHECK-NEXT: call void @outlined_ir_func_1(ptr [[A]], ptr [[B]], ptr [[C]])
+; CHECK-NEXT: call void @outlined_ir_func_2(ptr [[A]], ptr [[B]], ptr [[C]])
; CHECK-NEXT: ret void
;
entry:
@@ -198,7 +198,7 @@ define void @dontoutline_from_flipped_sub4() {
; CHECK-NEXT: [[A:%.*]] = alloca i32, align 4
; CHECK-NEXT: [[B:%.*]] = alloca i32, align 4
; CHECK-NEXT: [[C:%.*]] = alloca i32, align 4
-; CHECK-NEXT: call void @outlined_ir_func_1(ptr [[A]], ptr [[B]], ptr [[C]])
+; CHECK-NEXT: call void @outlined_ir_func_2(ptr [[A]], ptr [[B]], ptr [[C]])
; CHECK-NEXT: ret void
;
entry:
@@ -237,9 +237,9 @@ entry:
; CHECK-NEXT: [[AL:%.*]] = load i32, ptr [[ARG0]], align 4
; CHECK-NEXT: [[BL:%.*]] = load i32, ptr [[ARG1]], align 4
; CHECK-NEXT: [[CL:%.*]] = load i32, ptr [[ARG2]], align 4
-; CHECK-NEXT: [[TMP0:%.*]] = sub i32 [[BL]], [[AL]]
-; CHECK-NEXT: [[TMP1:%.*]] = sub i32 [[CL]], [[AL]]
-; CHECK-NEXT: [[TMP2:%.*]] = sub i32 [[CL]], [[BL]]
+; CHECK-NEXT: [[TMP0:%.*]] = sub i32 [[AL]], [[BL]]
+; CHECK-NEXT: [[TMP1:%.*]] = sub i32 [[AL]], [[CL]]
+; CHECK-NEXT: [[TMP2:%.*]] = sub i32 [[BL]], [[CL]]
; CHECK: define internal void @outlined_ir_func_2(ptr [[ARG0:%.*]], ptr [[ARG1:%.*]], ptr [[ARG2:%.*]]) #0 {
; CHECK: entry_to_outline:
@@ -249,6 +249,6 @@ entry:
; CHECK-NEXT: [[AL:%.*]] = load i32, ptr [[ARG0]], align 4
; CHECK-NEXT: [[BL:%.*]] = load i32, ptr [[ARG1]], align 4
; CHECK-NEXT: [[CL:%.*]] = load i32, ptr [[ARG2]], align 4
-; CHECK-NEXT: [[TMP0:%.*]] = sub i32 [[AL]], [[BL]]
-; CHECK-NEXT: [[TMP1:%.*]] = sub i32 [[AL]], [[CL]]
-; CHECK-NEXT: [[TMP2:%.*]] = sub i32 [[BL]], [[CL]]
+; CHECK-NEXT: [[TMP0:%.*]] = sub i32 [[BL]], [[AL]]
+; CHECK-NEXT: [[TMP1:%.*]] = sub i32 [[CL]], [[AL]]
+; CHECK-NEXT: [[TMP2:%.*]] = sub i32 [[CL]], [[BL]]
diff --git a/llvm/test/tools/llvm-sim/single-sim-file.test b/llvm/test/tools/llvm-sim/single-sim-file.test
index cef14b36085005..4279931f36cdf2 100644
--- a/llvm/test/tools/llvm-sim/single-sim-file.test
+++ b/llvm/test/tools/llvm-sim/single-sim-file.test
@@ -6,52 +6,52 @@
# CHECK: {
# CHECK-NEXT: "1": [
# CHECK-NEXT: {
-# CHECK-NEXT: "start": 14,
-# CHECK-NEXT: "end": 19
-# CHECK-NEXT: },
-# CHECK-NEXT: {
# CHECK-NEXT: "start": 4,
# CHECK-NEXT: "end": 9
+# CHECK-NEXT: },
+# CHECK-NEXT: {
+# CHECK-NEXT: "start": 14,
+# CHECK-NEXT: "end": 19
# CHECK-NEXT: }
# CHECK-NEXT: ],
# CHECK-NEXT: "2": [
# CHECK-NEXT: {
-# CHECK-NEXT: "start": 15,
-# CHECK-NEXT: "end": 19
-# CHECK-NEXT: },
-# CHECK-NEXT: {
# CHECK-NEXT: "start": 5,
# CHECK-NEXT: "end": 9
+# CHECK-NEXT: },
+# CHECK-NEXT: {
+# CHECK-NEXT: "start": 15,
+# CHECK-NEXT: "end": 19
# CHECK-NEXT: }
# CHECK-NEXT: ],
# CHECK-NEXT: "3": [
# CHECK-NEXT: {
-# CHECK-NEXT: "start": 16,
-# CHECK-NEXT: "end": 19
-# CHECK-NEXT: },
-# CHECK-NEXT: {
# CHECK-NEXT: "start": 6,
# CHECK-NEXT: "end": 9
+# CHECK-NEXT: },
+# CHECK-NEXT: {
+# CHECK-NEXT: "start": 16,
+# CHECK-NEXT: "end": 19
# CHECK-NEXT: }
# CHECK-NEXT: ],
# CHECK-NEXT: "4": [
# CHECK-NEXT: {
-# CHECK-NEXT: "start": 17,
-# CHECK-NEXT: "end": 19
-# CHECK-NEXT: },
-# CHECK-NEXT: {
# CHECK-NEXT: "start": 7,
# CHECK-NEXT: "end": 9
+# CHECK-NEXT: },
+# CHECK-NEXT: {
+# CHECK-NEXT: "start": 17,
+# CHECK-NEXT: "end": 19
# CHECK-NEXT: }
# CHECK-NEXT: ],
# CHECK-NEXT: "5": [
# CHECK-NEXT: {
-# CHECK-NEXT: "start": 18,
-# CHECK-NEXT: "end": 19
-# CHECK-NEXT: },
-# CHECK-NEXT: {
# CHECK-NEXT: "start": 8,
# CHECK-NEXT: "end": 9
+# CHECK-NEXT: },
+# CHECK-NEXT: {
+# CHECK-NEXT: "start": 18,
+# CHECK-NEXT: "end": 19
# CHECK-NEXT: }
# CHECK-NEXT: ]
# CHECK-NEXT:}
diff --git a/llvm/test/tools/llvm-sim/single-sim.test b/llvm/test/tools/llvm-sim/single-sim.test
index 0095ec6acbc588..3300b5cbda31a5 100644
--- a/llvm/test/tools/llvm-sim/single-sim.test
+++ b/llvm/test/tools/llvm-sim/single-sim.test
@@ -5,52 +5,52 @@
# CHECK: {
# CHECK-NEXT: "1": [
# CHECK-NEXT: {
-# CHECK-NEXT: "start": 14,
-# CHECK-NEXT: "end": 19
-# CHECK-NEXT: },
-# CHECK-NEXT: {
# CHECK-NEXT: "start": 4,
# CHECK-NEXT: "end": 9
+# CHECK-NEXT: },
+# CHECK-NEXT: {
+# CHECK-NEXT: "start": 14,
+# CHECK-NEXT: "end": 19
# CHECK-NEXT: }
# CHECK-NEXT: ],
# CHECK-NEXT: "2": [
# CHECK-NEXT: {
-# CHECK-NEXT: "start": 15,
-# CHECK-NEXT: "end": 19
-# CHECK-NEXT: },
-# CHECK-NEXT: {
# CHECK-NEXT: "start": 5,
# CHECK-NEXT: "end": 9
+# CHECK-NEXT: },
+# CHECK-NEXT: {
+# CHECK-NEXT: "start": 15,
+# CHECK-NEXT: "end": 19
# CHECK-NEXT: }
# CHECK-NEXT: ],
# CHECK-NEXT: "3": [
# CHECK-NEXT: {
-# CHECK-NEXT: "start": 16,
-# CHECK-NEXT: "end": 19
-# CHECK-NEXT: },
-# CHECK-NEXT: {
# CHECK-NEXT: "start": 6,
# CHECK-NEXT: "end": 9
+# CHECK-NEXT: },
+# CHECK-NEXT: {
+# CHECK-NEXT: "start": 16,
+# CHECK-NEXT: "end": 19
# CHECK-NEXT: }
# CHECK-NEXT: ],
# CHECK-NEXT: "4": [
# CHECK-NEXT: {
-# CHECK-NEXT: "start": 17,
-# CHECK-NEXT: "end": 19
-# CHECK-NEXT: },
-# CHECK-NEXT: {
# CHECK-NEXT: "start": 7,
# CHECK-NEXT: "end": 9
+# CHECK-NEXT: },
+# CHECK-NEXT: {
+# CHECK-NEXT: "start": 17,
+# CHECK-NEXT: "end": 19
# CHECK-NEXT: }
# CHECK-NEXT: ],
# CHECK-NEXT: "5": [
# CHECK-NEXT: {
-# CHECK-NEXT: "start": 18,
-# CHECK-NEXT: "end": 19
-# CHECK-NEXT: },
-# CHECK-NEXT: {
# CHECK-NEXT: "start": 8,
# CHECK-NEXT: "end": 9
+# CHECK-NEXT: },
+# CHECK-NEXT: {
+# CHECK-NEXT: "start": 18,
+# CHECK-NEXT: "end": 19
# CHECK-NEXT: }
# CHECK-NEXT: ]
# CHECK-NEXT:}