[llvm] [MachineOutliner] Sort Outlining Functions by Priority (PR #88990)

Xuan Zhang via llvm-commits llvm-commits at lists.llvm.org
Tue Apr 16 14:27:07 PDT 2024


https://github.com/xuanzh-meta created https://github.com/llvm/llvm-project/pull/88990

This PR changes the order in which the MachineOutliner outlines functions.

In MachineOutliner::outline(), functions are currently sorted by benefit and then outlined greedily, one at a time. Our investigation shows that sorting the functions by a different value, which we call priority, can be more beneficial. For clang, sorting the functions by priority reduces the binary size by 1.8% compared to the baseline; if we additionally consider leaf descendants to include more candidates, the binary size is reduced by 2.8% compared to the baseline.

The formula for priority was found via a black-box Bayesian optimization toolbox, and we have shown that sorting by this formula consistently reduces uncompressed mobile app size within our company.
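
As an illustration only: the second patch below defines priority as getNotOutlinedCost / getOutliningCost and compares two priorities by cross-multiplication rather than division. The following is a minimal standalone sketch of that ordering, not the LLVM code itself. `Fn`, `higherPriority`, and `outliningOrder` are hypothetical names introduced here, and the cost values in the usage note are the ones quoted in the new test's comments.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-in for an outlined function; it models only the
// quantities that the comparator in the patch reads.
struct Fn {
  std::string Name;
  unsigned NotOutlinedCost; // total size if every occurrence stays inline
  unsigned OutliningCost;   // total size after outlining (calls + body + frame)

  // Benefit is clamped at zero when outlining would not save anything.
  unsigned getBenefit() const {
    return NotOutlinedCost > OutliningCost ? NotOutlinedCost - OutliningCost
                                           : 0;
  }
};

// Returns true if L should be outlined before R.
// Entries with zero benefit sort last. Otherwise compare
// NotOutlinedCost / OutliningCost without dividing:
//   L.N / L.O > R.N / R.O  <=>  L.N * R.O > R.N * L.O  (costs are positive).
// The products are widened to 64 bits here to avoid overflow (an addition
// made for this sketch).
bool higherPriority(const Fn &L, const Fn &R) {
  if (L.getBenefit() == 0)
    return false;
  if (R.getBenefit() == 0)
    return true;
  return static_cast<unsigned long long>(L.NotOutlinedCost) * R.OutliningCost >
         static_cast<unsigned long long>(R.NotOutlinedCost) * L.OutliningCost;
}

// Returns the names in the order a greedy outliner would visit them.
std::vector<std::string> outliningOrder(std::vector<Fn> Fns) {
  std::stable_sort(Fns.begin(), Fns.end(), higherPriority);
  std::vector<std::string> Names;
  for (const Fn &F : Fns)
    Names.push_back(F.Name);
  return Names;
}
```

With the two sequences from the new test (costs 120/96 and 48/28), the first has the larger benefit (24 versus 20) but the smaller priority (1.25 versus about 1.71), so `outliningOrder` places the second sequence first. Cross-multiplication keeps the comparison in integer arithmetic, which avoids floating-point rounding in the sort predicate.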


From 84db82fdb07928e5ba59a4f6e652dbfcfcd3acd7 Mon Sep 17 00:00:00 2001
From: Xuan Zhang <xuanzh at meta.com>
Date: Fri, 12 Apr 2024 09:59:15 -0700
Subject: [PATCH 1/2] efficient implementation of
 MachineOutliner::findCandidates()

---
 llvm/lib/CodeGen/MachineOutliner.cpp          | 11 ++---
 llvm/lib/Support/SuffixTree.cpp               |  5 +++
 .../Analysis/IRSimilarityIdentifier/basic.ll  | 26 ++++++------
 .../IRSimilarityIdentifier/different.ll       |  6 +--
 .../IROutliner/outlining-commutative.ll       | 20 +++++-----
 llvm/test/tools/llvm-sim/single-sim-file.test | 40 +++++++++----------
 llvm/test/tools/llvm-sim/single-sim.test      | 40 +++++++++----------
 7 files changed, 75 insertions(+), 73 deletions(-)

diff --git a/llvm/lib/CodeGen/MachineOutliner.cpp b/llvm/lib/CodeGen/MachineOutliner.cpp
index dc2f5ef15206e8..e682d42c76747e 100644
--- a/llvm/lib/CodeGen/MachineOutliner.cpp
+++ b/llvm/lib/CodeGen/MachineOutliner.cpp
@@ -616,17 +616,14 @@ void MachineOutliner::findCandidates(
       // * End before the other starts
       // * Start after the other ends
       unsigned EndIdx = StartIdx + StringLen - 1;
-      auto FirstOverlap = find_if(
-          CandidatesForRepeatedSeq, [StartIdx, EndIdx](const Candidate &C) {
-            return EndIdx >= C.getStartIdx() && StartIdx <= C.getEndIdx();
-          });
-      if (FirstOverlap != CandidatesForRepeatedSeq.end()) {
+      if (CandidatesForRepeatedSeq.size() > 0 &&
+          StartIdx <= CandidatesForRepeatedSeq.back().getEndIdx()) {
 #ifndef NDEBUG
         ++NumDiscarded;
         LLVM_DEBUG(dbgs() << "    .. DISCARD candidate @ [" << StartIdx
                           << ", " << EndIdx << "]; overlaps with candidate @ ["
-                          << FirstOverlap->getStartIdx() << ", "
-                          << FirstOverlap->getEndIdx() << "]\n");
+                          << CandidatesForRepeatedSeq.back().getStartIdx() << ", "
+                          << CandidatesForRepeatedSeq.back().getEndIdx() << "]\n");
 #endif
         continue;
       }
diff --git a/llvm/lib/Support/SuffixTree.cpp b/llvm/lib/Support/SuffixTree.cpp
index eaa653078e0900..03ed1d02840aa1 100644
--- a/llvm/lib/Support/SuffixTree.cpp
+++ b/llvm/lib/Support/SuffixTree.cpp
@@ -274,6 +274,11 @@ void SuffixTree::RepeatedSubstringIterator::advance() {
     RS.Length = Length;
     for (unsigned StartIdx : RepeatedSubstringStarts)
       RS.StartIndices.push_back(StartIdx);
+
+    // Sort the start indices so that we can efficiently check if candidates
+    // overlap with each other in MachineOutliner::findCandidates().
+    llvm::sort(RS.StartIndices);
+
     break;
   }
   // At this point, either NewRS is an empty RepeatedSubstring, or it was
diff --git a/llvm/test/Analysis/IRSimilarityIdentifier/basic.ll b/llvm/test/Analysis/IRSimilarityIdentifier/basic.ll
index 1c08cb407c2e3c..b38e7d19973db6 100644
--- a/llvm/test/Analysis/IRSimilarityIdentifier/basic.ll
+++ b/llvm/test/Analysis/IRSimilarityIdentifier/basic.ll
@@ -4,7 +4,7 @@
 ; This is a simple test to make sure the IRSimilarityIdentifier and
 ; IRSimilarityPrinterPass is working.
 
-; CHECK: 4 candidates of length 6.  Found in: 
+; CHECK: 4 candidates of length 6.  Found in:
 ; CHECK-NEXT:  Function: turtle, Basic Block: (unnamed)
 ; CHECK-NEXT:    Start Instruction:   store i32 1, ptr %1, align 4
 ; CHECK-NEXT:      End Instruction:   store i32 6, ptr %6, align 4
@@ -17,7 +17,7 @@
 ; CHECK-NEXT:  Function: dog, Basic Block: entry
 ; CHECK-NEXT:    Start Instruction:   store i32 6, ptr %0, align 4
 ; CHECK-NEXT:      End Instruction:   store i32 5, ptr %5, align 4
-; CHECK-NEXT:4 candidates of length 5.  Found in: 
+; CHECK-NEXT:4 candidates of length 5.  Found in:
 ; CHECK-NEXT:  Function: turtle, Basic Block: (unnamed)
 ; CHECK-NEXT:    Start Instruction:   store i32 2, ptr %2, align 4
 ; CHECK-NEXT:      End Instruction:   store i32 6, ptr %6, align 4
@@ -30,7 +30,7 @@
 ; CHECK-NEXT:  Function: dog, Basic Block: entry
 ; CHECK-NEXT:    Start Instruction:   store i32 1, ptr %1, align 4
 ; CHECK-NEXT:      End Instruction:   store i32 5, ptr %5, align 4
-; CHECK-NEXT:4 candidates of length 4.  Found in: 
+; CHECK-NEXT:4 candidates of length 4.  Found in:
 ; CHECK-NEXT:  Function: turtle, Basic Block: (unnamed)
 ; CHECK-NEXT:    Start Instruction:   store i32 3, ptr %3, align 4
 ; CHECK-NEXT:      End Instruction:   store i32 6, ptr %6, align 4
@@ -43,7 +43,7 @@
 ; CHECK-NEXT:  Function: dog, Basic Block: entry
 ; CHECK-NEXT:    Start Instruction:   store i32 2, ptr %2, align 4
 ; CHECK-NEXT:      End Instruction:   store i32 5, ptr %5, align 4
-; CHECK-NEXT:4 candidates of length 3.  Found in: 
+; CHECK-NEXT:4 candidates of length 3.  Found in:
 ; CHECK-NEXT:  Function: turtle, Basic Block: (unnamed)
 ; CHECK-NEXT:    Start Instruction:   store i32 4, ptr %4, align 4
 ; CHECK-NEXT:      End Instruction:   store i32 6, ptr %6, align 4
@@ -56,7 +56,7 @@
 ; CHECK-NEXT:  Function: dog, Basic Block: entry
 ; CHECK-NEXT:    Start Instruction:   store i32 3, ptr %3, align 4
 ; CHECK-NEXT:      End Instruction:   store i32 5, ptr %5, align 4
-; CHECK-NEXT:4 candidates of length 2.  Found in: 
+; CHECK-NEXT:4 candidates of length 2.  Found in:
 ; CHECK-NEXT:  Function: turtle, Basic Block: (unnamed)
 ; CHECK-NEXT:    Start Instruction:   store i32 5, ptr %5, align 4
 ; CHECK-NEXT:      End Instruction:   store i32 6, ptr %6, align 4
@@ -70,40 +70,40 @@
 ; CHECK-NEXT:    Start Instruction:   store i32 4, ptr %4, align 4
 ; CHECK-NEXT:      End Instruction:   store i32 5, ptr %5, align 4
 
-define linkonce_odr void @fish() {
-entry:
-  %0 = alloca i32, align 4
+define void @turtle() {
   %1 = alloca i32, align 4
   %2 = alloca i32, align 4
   %3 = alloca i32, align 4
   %4 = alloca i32, align 4
   %5 = alloca i32, align 4
-  store i32 6, ptr %0, align 4
+  %6 = alloca i32, align 4
   store i32 1, ptr %1, align 4
   store i32 2, ptr %2, align 4
   store i32 3, ptr %3, align 4
   store i32 4, ptr %4, align 4
   store i32 5, ptr %5, align 4
+  store i32 6, ptr %6, align 4
   ret void
 }
 
-define void @turtle() {
+define void @cat() {
+entry:
+  %0 = alloca i32, align 4
   %1 = alloca i32, align 4
   %2 = alloca i32, align 4
   %3 = alloca i32, align 4
   %4 = alloca i32, align 4
   %5 = alloca i32, align 4
-  %6 = alloca i32, align 4
+  store i32 6, ptr %0, align 4
   store i32 1, ptr %1, align 4
   store i32 2, ptr %2, align 4
   store i32 3, ptr %3, align 4
   store i32 4, ptr %4, align 4
   store i32 5, ptr %5, align 4
-  store i32 6, ptr %6, align 4
   ret void
 }
 
-define void @cat() {
+define linkonce_odr void @fish() {
 entry:
   %0 = alloca i32, align 4
   %1 = alloca i32, align 4
diff --git a/llvm/test/Analysis/IRSimilarityIdentifier/different.ll b/llvm/test/Analysis/IRSimilarityIdentifier/different.ll
index e5c9970b159b9f..70d422077c3e9c 100644
--- a/llvm/test/Analysis/IRSimilarityIdentifier/different.ll
+++ b/llvm/test/Analysis/IRSimilarityIdentifier/different.ll
@@ -14,11 +14,11 @@
 ; CHECK-NEXT:       End Instruction:   store i32 5, ptr %5, align 4
 ; CHECK-NEXT: 2 candidates of length 3.  Found in:
 ; CHECK-NEXT:   Function: turtle, Basic Block: (unnamed)
-; CHECK-NEXT:     Start Instruction:   %b = load i32, ptr %1, align 4
-; CHECK-NEXT:       End Instruction:   %d = load i32, ptr %3, align 4
-; CHECK-NEXT:   Function: turtle, Basic Block: (unnamed)
 ; CHECK-NEXT:     Start Instruction:   %a = load i32, ptr %0, align 4
 ; CHECK-NEXT:       End Instruction:   %c = load i32, ptr %2, align 4
+; CHECK-NEXT:   Function: turtle, Basic Block: (unnamed)
+; CHECK-NEXT:     Start Instruction:   %b = load i32, ptr %1, align 4
+; CHECK-NEXT:       End Instruction:   %d = load i32, ptr %3, align 4
 
 define linkonce_odr void @fish() {
 entry:
diff --git a/llvm/test/Transforms/IROutliner/outlining-commutative.ll b/llvm/test/Transforms/IROutliner/outlining-commutative.ll
index 8862dc295d4351..1534829bad7ba7 100644
--- a/llvm/test/Transforms/IROutliner/outlining-commutative.ll
+++ b/llvm/test/Transforms/IROutliner/outlining-commutative.ll
@@ -123,7 +123,7 @@ define void @outline_from_sub1() {
 ; CHECK-NEXT:    [[A:%.*]] = alloca i32, align 4
 ; CHECK-NEXT:    [[B:%.*]] = alloca i32, align 4
 ; CHECK-NEXT:    [[C:%.*]] = alloca i32, align 4
-; CHECK-NEXT:    call void @outlined_ir_func_2(ptr [[A]], ptr [[B]], ptr [[C]])
+; CHECK-NEXT:    call void @outlined_ir_func_1(ptr [[A]], ptr [[B]], ptr [[C]])
 ; CHECK-NEXT:    ret void
 ;
 entry:
@@ -148,7 +148,7 @@ define void @outline_from_sub2() {
 ; CHECK-NEXT:    [[A:%.*]] = alloca i32, align 4
 ; CHECK-NEXT:    [[B:%.*]] = alloca i32, align 4
 ; CHECK-NEXT:    [[C:%.*]] = alloca i32, align 4
-; CHECK-NEXT:    call void @outlined_ir_func_2(ptr [[A]], ptr [[B]], ptr [[C]])
+; CHECK-NEXT:    call void @outlined_ir_func_1(ptr [[A]], ptr [[B]], ptr [[C]])
 ; CHECK-NEXT:    ret void
 ;
 entry:
@@ -173,7 +173,7 @@ define void @dontoutline_from_flipped_sub3() {
 ; CHECK-NEXT:    [[A:%.*]] = alloca i32, align 4
 ; CHECK-NEXT:    [[B:%.*]] = alloca i32, align 4
 ; CHECK-NEXT:    [[C:%.*]] = alloca i32, align 4
-; CHECK-NEXT:    call void @outlined_ir_func_1(ptr [[A]], ptr [[B]], ptr [[C]])
+; CHECK-NEXT:    call void @outlined_ir_func_2(ptr [[A]], ptr [[B]], ptr [[C]])
 ; CHECK-NEXT:    ret void
 ;
 entry:
@@ -198,7 +198,7 @@ define void @dontoutline_from_flipped_sub4() {
 ; CHECK-NEXT:    [[A:%.*]] = alloca i32, align 4
 ; CHECK-NEXT:    [[B:%.*]] = alloca i32, align 4
 ; CHECK-NEXT:    [[C:%.*]] = alloca i32, align 4
-; CHECK-NEXT:    call void @outlined_ir_func_1(ptr [[A]], ptr [[B]], ptr [[C]])
+; CHECK-NEXT:    call void @outlined_ir_func_2(ptr [[A]], ptr [[B]], ptr [[C]])
 ; CHECK-NEXT:    ret void
 ;
 entry:
@@ -237,9 +237,9 @@ entry:
 ; CHECK-NEXT:    [[AL:%.*]] = load i32, ptr [[ARG0]], align 4
 ; CHECK-NEXT:    [[BL:%.*]] = load i32, ptr [[ARG1]], align 4
 ; CHECK-NEXT:    [[CL:%.*]] = load i32, ptr [[ARG2]], align 4
-; CHECK-NEXT:    [[TMP0:%.*]] = sub i32 [[BL]], [[AL]]
-; CHECK-NEXT:    [[TMP1:%.*]] = sub i32 [[CL]], [[AL]]
-; CHECK-NEXT:    [[TMP2:%.*]] = sub i32 [[CL]], [[BL]]
+; CHECK-NEXT:    [[TMP0:%.*]] = sub i32 [[AL]], [[BL]]
+; CHECK-NEXT:    [[TMP1:%.*]] = sub i32 [[AL]], [[CL]]
+; CHECK-NEXT:    [[TMP2:%.*]] = sub i32 [[BL]], [[CL]]
 
 ; CHECK: define internal void @outlined_ir_func_2(ptr [[ARG0:%.*]], ptr [[ARG1:%.*]], ptr [[ARG2:%.*]]) #0 {
 ; CHECK: entry_to_outline:
@@ -249,6 +249,6 @@ entry:
 ; CHECK-NEXT:    [[AL:%.*]] = load i32, ptr [[ARG0]], align 4
 ; CHECK-NEXT:    [[BL:%.*]] = load i32, ptr [[ARG1]], align 4
 ; CHECK-NEXT:    [[CL:%.*]] = load i32, ptr [[ARG2]], align 4
-; CHECK-NEXT:    [[TMP0:%.*]] = sub i32 [[AL]], [[BL]]
-; CHECK-NEXT:    [[TMP1:%.*]] = sub i32 [[AL]], [[CL]]
-; CHECK-NEXT:    [[TMP2:%.*]] = sub i32 [[BL]], [[CL]]
+; CHECK-NEXT:    [[TMP0:%.*]] = sub i32 [[BL]], [[AL]]
+; CHECK-NEXT:    [[TMP1:%.*]] = sub i32 [[CL]], [[AL]]
+; CHECK-NEXT:    [[TMP2:%.*]] = sub i32 [[CL]], [[BL]]
diff --git a/llvm/test/tools/llvm-sim/single-sim-file.test b/llvm/test/tools/llvm-sim/single-sim-file.test
index cef14b36085005..4279931f36cdf2 100644
--- a/llvm/test/tools/llvm-sim/single-sim-file.test
+++ b/llvm/test/tools/llvm-sim/single-sim-file.test
@@ -6,52 +6,52 @@
 # CHECK: {
 # CHECK-NEXT: "1": [
 # CHECK-NEXT:  {
-# CHECK-NEXT:   "start": 14,
-# CHECK-NEXT:   "end": 19
-# CHECK-NEXT:  },
-# CHECK-NEXT:  {
 # CHECK-NEXT:   "start": 4,
 # CHECK-NEXT:   "end": 9
+# CHECK-NEXT:  },
+# CHECK-NEXT:  {
+# CHECK-NEXT:   "start": 14,
+# CHECK-NEXT:   "end": 19
 # CHECK-NEXT:  }
 # CHECK-NEXT: ],
 # CHECK-NEXT: "2": [
 # CHECK-NEXT:  {
-# CHECK-NEXT:   "start": 15,
-# CHECK-NEXT:   "end": 19
-# CHECK-NEXT:  },
-# CHECK-NEXT:  {
 # CHECK-NEXT:   "start": 5,
 # CHECK-NEXT:   "end": 9
+# CHECK-NEXT:  },
+# CHECK-NEXT:  {
+# CHECK-NEXT:   "start": 15,
+# CHECK-NEXT:   "end": 19
 # CHECK-NEXT:  }
 # CHECK-NEXT: ],
 # CHECK-NEXT: "3": [
 # CHECK-NEXT:  {
-# CHECK-NEXT:   "start": 16,
-# CHECK-NEXT:   "end": 19
-# CHECK-NEXT:  },
-# CHECK-NEXT:  {
 # CHECK-NEXT:   "start": 6,
 # CHECK-NEXT:   "end": 9
+# CHECK-NEXT:  },
+# CHECK-NEXT:  {
+# CHECK-NEXT:   "start": 16,
+# CHECK-NEXT:   "end": 19
 # CHECK-NEXT:  }
 # CHECK-NEXT: ],
 # CHECK-NEXT: "4": [
 # CHECK-NEXT:  {
-# CHECK-NEXT:   "start": 17,
-# CHECK-NEXT:   "end": 19
-# CHECK-NEXT:  },
-# CHECK-NEXT:  {
 # CHECK-NEXT:   "start": 7,
 # CHECK-NEXT:   "end": 9
+# CHECK-NEXT:  },
+# CHECK-NEXT:  {
+# CHECK-NEXT:   "start": 17,
+# CHECK-NEXT:   "end": 19
 # CHECK-NEXT:  }
 # CHECK-NEXT: ],
 # CHECK-NEXT: "5": [
 # CHECK-NEXT:  {
-# CHECK-NEXT:   "start": 18,
-# CHECK-NEXT:   "end": 19
-# CHECK-NEXT:  },
-# CHECK-NEXT:  {
 # CHECK-NEXT:   "start": 8,
 # CHECK-NEXT:   "end": 9
+# CHECK-NEXT:  },
+# CHECK-NEXT:  {
+# CHECK-NEXT:   "start": 18,
+# CHECK-NEXT:   "end": 19
 # CHECK-NEXT:  }
 # CHECK-NEXT: ]
 # CHECK-NEXT:}
diff --git a/llvm/test/tools/llvm-sim/single-sim.test b/llvm/test/tools/llvm-sim/single-sim.test
index 0095ec6acbc588..3300b5cbda31a5 100644
--- a/llvm/test/tools/llvm-sim/single-sim.test
+++ b/llvm/test/tools/llvm-sim/single-sim.test
@@ -5,52 +5,52 @@
 # CHECK: {
 # CHECK-NEXT: "1": [
 # CHECK-NEXT:  {
-# CHECK-NEXT:   "start": 14,
-# CHECK-NEXT:   "end": 19
-# CHECK-NEXT:  },
-# CHECK-NEXT:  {
 # CHECK-NEXT:   "start": 4,
 # CHECK-NEXT:   "end": 9
+# CHECK-NEXT:  },
+# CHECK-NEXT:  {
+# CHECK-NEXT:   "start": 14,
+# CHECK-NEXT:   "end": 19
 # CHECK-NEXT:  }
 # CHECK-NEXT: ],
 # CHECK-NEXT: "2": [
 # CHECK-NEXT:  {
-# CHECK-NEXT:   "start": 15,
-# CHECK-NEXT:   "end": 19
-# CHECK-NEXT:  },
-# CHECK-NEXT:  {
 # CHECK-NEXT:   "start": 5,
 # CHECK-NEXT:   "end": 9
+# CHECK-NEXT:  },
+# CHECK-NEXT:  {
+# CHECK-NEXT:   "start": 15,
+# CHECK-NEXT:   "end": 19
 # CHECK-NEXT:  }
 # CHECK-NEXT: ],
 # CHECK-NEXT: "3": [
 # CHECK-NEXT:  {
-# CHECK-NEXT:   "start": 16,
-# CHECK-NEXT:   "end": 19
-# CHECK-NEXT:  },
-# CHECK-NEXT:  {
 # CHECK-NEXT:   "start": 6,
 # CHECK-NEXT:   "end": 9
+# CHECK-NEXT:  },
+# CHECK-NEXT:  {
+# CHECK-NEXT:   "start": 16,
+# CHECK-NEXT:   "end": 19
 # CHECK-NEXT:  }
 # CHECK-NEXT: ],
 # CHECK-NEXT: "4": [
 # CHECK-NEXT:  {
-# CHECK-NEXT:   "start": 17,
-# CHECK-NEXT:   "end": 19
-# CHECK-NEXT:  },
-# CHECK-NEXT:  {
 # CHECK-NEXT:   "start": 7,
 # CHECK-NEXT:   "end": 9
+# CHECK-NEXT:  },
+# CHECK-NEXT:  {
+# CHECK-NEXT:   "start": 17,
+# CHECK-NEXT:   "end": 19
 # CHECK-NEXT:  }
 # CHECK-NEXT: ],
 # CHECK-NEXT: "5": [
 # CHECK-NEXT:  {
-# CHECK-NEXT:   "start": 18,
-# CHECK-NEXT:   "end": 19
-# CHECK-NEXT:  },
-# CHECK-NEXT:  {
 # CHECK-NEXT:   "start": 8,
 # CHECK-NEXT:   "end": 9
+# CHECK-NEXT:  },
+# CHECK-NEXT:  {
+# CHECK-NEXT:   "start": 18,
+# CHECK-NEXT:   "end": 19
 # CHECK-NEXT:  }
 # CHECK-NEXT: ]
 # CHECK-NEXT:}

From 010bc6e9596844a69ba1ddf60ff0174d1d3c4bc3 Mon Sep 17 00:00:00 2001
From: Xuan Zhang <xuanzh at meta.com>
Date: Fri, 12 Apr 2024 10:55:13 -0700
Subject: [PATCH 2/2] outlining order based on priority instead of benefits

---
 llvm/lib/CodeGen/MachineOutliner.cpp          | 10 +-
 .../machine-outliner-sort-per-priority.ll     | 96 +++++++++++++++++++
 2 files changed, 104 insertions(+), 2 deletions(-)
 create mode 100644 llvm/test/CodeGen/AArch64/machine-outliner-sort-per-priority.ll

diff --git a/llvm/lib/CodeGen/MachineOutliner.cpp b/llvm/lib/CodeGen/MachineOutliner.cpp
index e682d42c76747e..341c94e7adf2ee 100644
--- a/llvm/lib/CodeGen/MachineOutliner.cpp
+++ b/llvm/lib/CodeGen/MachineOutliner.cpp
@@ -825,10 +825,16 @@ bool MachineOutliner::outline(Module &M,
                     << "\n");
   bool OutlinedSomething = false;
 
-  // Sort by benefit. The most beneficial functions should be outlined first.
+  // Sort by priority where priority := getNotOutlinedCost / getOutliningCost.
+  // The function with highest priority should be outlined first.
   stable_sort(FunctionList,
               [](const OutlinedFunction &LHS, const OutlinedFunction &RHS) {
-                return LHS.getBenefit() > RHS.getBenefit();
+                if (LHS.getBenefit() == 0)
+                  return false;
+                if (LHS.getBenefit() > 0 && RHS.getBenefit() == 0)
+                  return true;
+                return LHS.getNotOutlinedCost() * RHS.getOutliningCost() >
+                       RHS.getNotOutlinedCost() * LHS.getOutliningCost();
               });
 
   // Walk over each function, outlining them as we go along. Functions are
diff --git a/llvm/test/CodeGen/AArch64/machine-outliner-sort-per-priority.ll b/llvm/test/CodeGen/AArch64/machine-outliner-sort-per-priority.ll
new file mode 100644
index 00000000000000..00efc3c6e71c89
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/machine-outliner-sort-per-priority.ll
@@ -0,0 +1,96 @@
+; This tests the order in which functions are outlined in MachineOutliner.
+; There are two key OutlinedFunctions in FunctionList.
+;
+; ===================== First One =====================
+;   ```
+;     mov     w0, #1
+;     mov     w1, #2
+;     mov     w2, #3
+;     mov     w3, #4
+;     mov     w4, #5
+;   ```
+; It has:
+;   - `SequenceSize=20` and `OccurrenceCount=6`
+;   - each Candidate has `CallOverhead=12` and `FrameOverhead=4`
+;   - `NotOutlinedCost=20*6=120` and `OutliningCost=12*6+20+4=96`
+;   - `Benefit=120-96=24` and `Priority=120/96=1.25`
+;
+; ===================== Second One =====================
+;   ```
+;     mov     w6, #6
+;     mov     w7, #7
+;     b
+;   ```
+; It has:
+;   - `SequenceSize=12` and `OccurrenceCount=4`
+;   - each Candidate has `CallOverhead=4` and `FrameOverhead=0`
+;   - `NotOutlinedCost=12*4=48` and `OutliningCost=4*4+12+0=28`
+;   - `Benefit=48-28=20` and `Priority=48/28=1.71`
+;
+; Note that the first one has higher benefit, but lower priority.
+; Hence, when outlining per priority, the second one will be outlined first.
+
+; RUN: llc %s -enable-machine-outliner=always -filetype=obj -o %t
+; RUN: llvm-objdump -d %t | FileCheck %s --check-prefix=CHECK-SORT-BY-PRIORITY
+
+; RUN: llc %s -enable-machine-outliner=always -outliner-benefit-threshold=22 -filetype=obj -o %t
+; RUN: llvm-objdump -d %t | FileCheck %s --check-prefix=CHECK-THRESHOLD
+
+
+target datalayout = "e-m:o-i64:64-i128:128-n32:64-S128"
+target triple = "arm64-apple-macosx14.0.0"
+
+declare i32 @_Z3fooiiii(i32 noundef, i32 noundef, i32 noundef, i32 noundef, i32 noundef, i32 noundef, i32 noundef, i32 noundef)
+
+define i32 @_Z2f1v() minsize {
+  %1 = tail call i32 @_Z3fooiiii(i32 noundef 1, i32 noundef 2, i32 noundef 3, i32 noundef 4, i32 noundef 5, i32 noundef 11, i32 noundef 6, i32 noundef 7)
+  ret i32 %1
+}
+
+define i32 @_Z2f2v() minsize {
+  %1 = tail call i32 @_Z3fooiiii(i32 noundef 1, i32 noundef 2, i32 noundef 3, i32 noundef 4, i32 noundef 5, i32 noundef 12, i32 noundef 6, i32 noundef 7)
+  ret i32 %1
+}
+
+define i32 @_Z2f3v() minsize {
+  %1 = tail call i32 @_Z3fooiiii(i32 noundef 1, i32 noundef 2, i32 noundef 3, i32 noundef 4, i32 noundef 5, i32 noundef 13, i32 noundef 6, i32 noundef 7)
+  ret i32 %1
+}
+
+define i32 @_Z2f4v() minsize {
+  %1 = tail call i32 @_Z3fooiiii(i32 noundef 1, i32 noundef 2, i32 noundef 3, i32 noundef 4, i32 noundef 5, i32 noundef 14, i32 noundef 6, i32 noundef 7)
+  ret i32 %1
+}
+
+define i32 @_Z2f5v() minsize {
+  %1 = tail call i32 @_Z3fooiiii(i32 noundef 1, i32 noundef 2, i32 noundef 3, i32 noundef 4, i32 noundef 5, i32 noundef 15, i32 noundef 8, i32 noundef 9)
+  ret i32 %1
+}
+
+define i32 @_Z2f6v() minsize {
+  %1 = tail call i32 @_Z3fooiiii(i32 noundef 1, i32 noundef 2, i32 noundef 3, i32 noundef 4, i32 noundef 5, i32 noundef 16, i32 noundef 9, i32 noundef 8)
+  ret i32 %1
+}
+
+; CHECK-SORT-BY-PRIORITY: <_OUTLINED_FUNCTION_0>:
+; CHECK-SORT-BY-PRIORITY-NEXT: mov     w6, #0x6
+; CHECK-SORT-BY-PRIORITY-NEXT: mov     w7, #0x7
+; CHECK-SORT-BY-PRIORITY-NEXT: b
+
+; CHECK-SORT-BY-PRIORITY: <_OUTLINED_FUNCTION_1>:
+; CHECK-SORT-BY-PRIORITY-NEXT: mov     w0, #0x1
+; CHECK-SORT-BY-PRIORITY-NEXT: mov     w1, #0x2
+; CHECK-SORT-BY-PRIORITY-NEXT: mov     w2, #0x3
+; CHECK-SORT-BY-PRIORITY-NEXT: mov     w3, #0x4
+; CHECK-SORT-BY-PRIORITY-NEXT: mov     w4, #0x5
+; CHECK-SORT-BY-PRIORITY-NEXT: ret
+
+; CHECK-THRESHOLD: <_OUTLINED_FUNCTION_0>:
+; CHECK-THRESHOLD-NEXT: mov     w0, #0x1
+; CHECK-THRESHOLD-NEXT: mov     w1, #0x2
+; CHECK-THRESHOLD-NEXT: mov     w2, #0x3
+; CHECK-THRESHOLD-NEXT: mov     w3, #0x4
+; CHECK-THRESHOLD-NEXT: mov     w4, #0x5
+; CHECK-THRESHOLD-NEXT: ret
+
+; CHECK-THRESHOLD-NOT: <_OUTLINED_FUNCTION_1>:


