[llvm] [BOLT] Skip the perf2bolt step on AArch64 (PR #112070)

via llvm-commits llvm-commits at lists.llvm.org
Fri Oct 11 20:14:30 PDT 2024


llvmbot wrote:


<!--LLVM PR SUMMARY COMMENT-->

@llvm/pr-subscribers-bolt

Author: Wenlong Mu (onroadmuwl)

<details>
<summary>Changes</summary>

To reduce BOLT's optimization time on AArch64, I tried passing the `-p perf.data -nl` options to `llvm-bolt` directly, skipping the separate perf2bolt step. However, the output shows that the target binary isn’t optimized by BOLT on AArch64, as seen below:

>                    0 : executed forward branches
>                    0 : taken forward branches
>                    0 : executed backward branches
>                    0 : taken backward branches
>                    0 : executed unconditional branches
>                    0 : all function calls
>                    0 : indirect calls
>                    0 : PLT calls
>                    0 : executed instructions
>                    0 : executed load instructions
>                    0 : executed store instructions
>                    0 : taken jump table branches
>                    0 : taken unknown indirect branches
>                    0 : total branches
>                    0 : taken branches
>                    0 : non-taken conditional branches
>                    0 : taken conditional branches
>                    0 : all conditional branches
>                    0 : linker-inserted veneer calls

After analyzing the cause, I resolved the issue by associating each `BinaryFunction` with its `SampleData`. Additionally, to prevent samples from being mapped to the wrong basic blocks, I ensured that the samples are sorted before they are processed.
With these changes, the output is now consistent with that obtained using the `-b perf.fdata` option, and the binary is optimized successfully.

>             77756297 : executed forward branches
>             45234387 : taken forward branches
>             21542072 : executed backward branches
>              9313428 : taken backward branches
>              9655158 : executed unconditional branches
>             24234134 : all function calls
>              7322769 : indirect calls
>              1489067 : PLT calls
>            668165930 : executed instructions
>            157103847 : executed load instructions
>                    0 : executed store instructions
>                    0 : taken jump table branches
>                    0 : taken unknown indirect branches
>            108953527 : total branches
>             64202973 : taken branches
>             44750554 : non-taken conditional branches
>             54547815 : taken conditional branches
>             99298369 : all conditional branches
>                    0 : linker-inserted veneer calls
> 
>             90228252 : executed forward branches (+16.0%)
>              3531918 : taken forward branches (-92.2%)
>              9070117 : executed backward branches (-57.9%)
>              3417336 : taken backward branches (-63.3%)
>              2902049 : executed unconditional branches (-69.9%)
>             24234134 : all function calls (=)
>              7322769 : indirect calls (=)
>              1489067 : PLT calls (=)
>            662605907 : executed instructions (-0.8%)
>            157103847 : executed load instructions (=)
>                    0 : executed store instructions (=)
>                    0 : taken jump table branches (=)
>                    0 : taken unknown indirect branches (=)
>            102200418 : total branches (-6.2%)
>              9851303 : taken branches (-84.7%)
>             92349115 : non-taken conditional branches (+106.4%)
>              6949254 : taken conditional branches (-87.3%)
>             99298369 : all conditional branches (=)
>                    0 : linker-inserted veneer calls (=)
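
To illustrate why the samples need to be sorted before they are mapped to basic blocks, here is a minimal standalone sketch. It is not BOLT's code: `Sample`, `sortSamples`, and `hitsInRange` are hypothetical names, and the sketch assumes that hits for a block are summed by binary-searching the sample list for the block's `[Start, End)` offset range. Under that assumption, an unsorted list would make the search land on the wrong entries, so some blocks would get counts that belong elsewhere.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one profile sample: an offset inside a
// function and the number of hits recorded at that offset.
struct Sample {
  uint64_t Offset;
  uint64_t Hits;
};

// Sort samples by offset. stable_sort keeps the relative order of
// samples that share an offset.
void sortSamples(std::vector<Sample> &Samples) {
  std::stable_sort(Samples.begin(), Samples.end(),
                   [](const Sample &A, const Sample &B) {
                     return A.Offset < B.Offset;
                   });
}

// Sum the hits of all samples whose offset lies in [Start, End).
// Precondition: Samples is sorted by Offset (call sortSamples first);
// the binary search below is only meaningful on sorted input.
uint64_t hitsInRange(const std::vector<Sample> &Samples, uint64_t Start,
                     uint64_t End) {
  auto ByOffset = [](const Sample &S, uint64_t Off) {
    return S.Offset < Off;
  };
  auto First =
      std::lower_bound(Samples.begin(), Samples.end(), Start, ByOffset);
  auto Last =
      std::lower_bound(Samples.begin(), Samples.end(), End, ByOffset);
  uint64_t Total = 0;
  for (auto It = First; It < Last; ++It)
    Total += It->Hits;
  return Total;
}
```

In this sketch, a caller would run `sortSamples` once per function and then call `hitsInRange` once per basic block with that block's start and end offsets.
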

Thank you for considering this PR. I look forward to any feedback you may have.


---
Full diff: https://github.com/llvm/llvm-project/pull/112070.diff


1 Files Affected:

- (modified) bolt/lib/Profile/DataAggregator.cpp (+8) 


``````````diff
diff --git a/bolt/lib/Profile/DataAggregator.cpp b/bolt/lib/Profile/DataAggregator.cpp
index 0a63148379d900..3bfa65c824c8a5 100644
--- a/bolt/lib/Profile/DataAggregator.cpp
+++ b/bolt/lib/Profile/DataAggregator.cpp
@@ -599,6 +599,11 @@ Error DataAggregator::readProfile(BinaryContext &BC) {
     convertBranchData(Function);
   }
 
+  for (auto &BFI : BC.getBinaryFunctions()) {
+    BinaryFunction &BF = BFI.second;
+    readSampleData(BF);
+  }
+
   if (opts::AggregateOnly) {
     if (opts::ProfileFormat == opts::ProfileFormatKind::PF_Fdata)
       if (std::error_code EC = writeAggregatedFile(opts::OutputFilename))
@@ -645,6 +650,9 @@ void DataAggregator::processProfile(BinaryContext &BC) {
   for (auto &FuncBranches : NamesToBranches)
     llvm::stable_sort(FuncBranches.second.Data);
 
+  for (auto &FuncSamples : NamesToSamples)
+    llvm::stable_sort(FuncSamples.second.Data);
+
   for (auto &MemEvents : NamesToMemEvents)
     llvm::stable_sort(MemEvents.second.Data);
 

``````````

</details>


https://github.com/llvm/llvm-project/pull/112070

