[llvm] [BOLT][AArch64] Introduce SPE mode in BasicAggregation (PR #120741)

Paschalis Mpeis via llvm-commits llvm-commits at lists.llvm.org
Wed Feb 5 08:04:53 PST 2025


================
@@ -1703,6 +1786,46 @@ std::error_code DataAggregator::parseBasicEvents() {
   return std::error_code();
 }
 
+std::error_code DataAggregator::parseSpeAsBasicEvents() {
+  outs() << "PERF2BOLT: parsing SPE data as basic events (no LBR)...\n";
+  NamedRegionTimer T("parseSPEBasic", "Parsing SPE as basic events",
+                     TimerGroupName, TimerGroupDesc, opts::TimeAggregator);
+  uint64_t NumSpeBranchSamples = 0;
+
+  // Convert entries to one or two basic samples, depending on whether there is
+  // branch target information.
+  while (hasData()) {
+    auto SamplePair = parseSpeAsBasicSamples();
+    if (std::error_code EC = SamplePair.getError())
+      return EC;
+
+    auto registerSample = [this](const PerfBasicSample *Sample) {
+      if (!Sample->PC)
+        return;
+
+      if (BinaryFunction *BF = getBinaryFunctionContainingAddress(Sample->PC))
+        BF->setHasProfileAvailable();
+
+      ++BasicSamples[Sample->PC];
+      EventNames.insert(Sample->EventName);
+    };
+
+    if (SamplePair->first.PC != 0x0 && SamplePair->second.PC != 0x0)
+      ++NumSpeBranchSamples;
+
+    registerSample(&SamplePair->first);
+    registerSample(&SamplePair->second);
----------------
paschalis-mpeis wrote:

Hey Pavel,

Reading this back, your concern is that recording a sample at `TGT` for branches that are not flagged `NOT-TAKEN` might add hotness to a block that should not receive it. Correct?

That should not be a concern: regardless of whether a branch is taken, the reported `TGT` is the address that was architecturally executed next. In other words, `NOT-TAKEN` (or its absence) describes what happened at the source branch (`PC`), while `TGT` always points to the path that was actually taken.

So, for fall-through SPE packets, `TGT` is always the address immediately after `PC` (i.e., `0xA00` + `4`, where `4` is the instruction size in AArch64):
```
PC 0xA00
B COND
EV RETIRED NOT-TAKEN
TGT 0xA04
```

For taken branches, `TGT` can be more than `4` bytes away from `PC`:
```
PC 0xA00
B COND
EV RETIRED
TGT 0xBBB
```
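
To make the two packets above concrete, here is a minimal standalone sketch of how such records could be turned into basic samples. It is not the PR's code: `SpeBranchRecord`, `isFallThroughTarget`, `addBasicSamples`, and `selfCheck` are hypothetical names used only for illustration, and the record layout is a simplification of what the parser actually produces.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical, simplified view of one SPE branch record (not BOLT's types).
struct SpeBranchRecord {
  uint64_t PC;   // address of the sampled branch
  uint64_t TGT;  // address executed next; 0 when no target was reported
  bool NotTaken; // true when the packet carries the NOT-TAKEN event
};

constexpr uint64_t AArch64InstrSize = 4;

// True when TGT is simply the instruction following PC.
bool isFallThroughTarget(const SpeBranchRecord &R) {
  return R.TGT == R.PC + AArch64InstrSize;
}

// Count one basic sample per nonzero address, mirroring the loop under
// review. Returns true when both PC and TGT were present.
bool addBasicSamples(const SpeBranchRecord &R,
                     std::map<uint64_t, uint64_t> &Counts) {
  if (R.PC)
    ++Counts[R.PC];
  if (R.TGT)
    ++Counts[R.TGT];
  return R.PC != 0 && R.TGT != 0;
}

// Usage: the two packets from the examples above.
void selfCheck() {
  std::map<uint64_t, uint64_t> Counts;
  SpeBranchRecord FallThrough{0xA00, 0xA04, /*NotTaken=*/true};
  SpeBranchRecord Taken{0xA00, 0xBBB, /*NotTaken=*/false};

  assert(isFallThroughTarget(FallThrough));
  assert(!isFallThroughTarget(Taken));

  assert(addBasicSamples(FallThrough, Counts));
  assert(addBasicSamples(Taken, Counts));

  // The branch itself is sampled twice; each executed target once.
  assert(Counts[0xA00] == 2);
  assert(Counts[0xA04] == 1);
  assert(Counts[0xBBB] == 1);
}
```

In both cases the sample at `TGT` lands on an address that was actually executed, so the `NOT-TAKEN` flag does not need to be consulted when counting.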

In my previous examples I was using mock addresses for `PC`/`TGT`, so I've updated the relevant examples to avoid confusion.

https://github.com/llvm/llvm-project/pull/120741

