[llvm] [BOLT] Adding a unittest that covers Arm SPE PBT aggregation (PR #160095)
    Ádám Kallai via llvm-commits 
    llvm-commits at lists.llvm.org
       
    Mon Oct 20 04:12:18 PDT 2025
    
    
  
https://github.com/kaadam updated https://github.com/llvm/llvm-project/pull/160095
>From 5baecf83f0cfa61b0429c158375cc47c6ea48634 Mon Sep 17 00:00:00 2001
From: Adam Kallai <kadam at inf.u-szeged.hu>
Date: Tue, 16 Sep 2025 11:09:41 +0200
Subject: [PATCH 1/3] [BOLT] Add unit test for SPE PBT feature
When the SPE previous branch target address (PBT) feature is available,
each SPE sample has two entries.
Arm SPE records the SRC/DEST addresses of the latest sampled branch operation
and stores them in the first entry.
PBT records the target address of the most recently taken branch in program
order before the sampled operation and places it in the second entry.
Together they form a chain of two consecutive branches, where:
- The previous branch operation (PBT) is always taken.
- In the SPE entry, the current source branch (SRC) may be either fall-through or taken.
- The target address (DEST) of the recorded branch operation is always
  what was architecturally executed.
However, PBT does not provide as much information as SPE does. It lacks the
source branch address, the branch type, and the prediction bit.
This information is always filled with zero in the PBT entry.
Therefore BOLT cannot evaluate the prediction and source branch fields,
and leaves them zero during the aggregation process.
Consider the following example of an SPE profile combined with PBT:
`<PID> <SRC>/<DEST>/PN/-/-/10/COND/- <NULL>/<PBT>/-/-/-/0//-
0xffff8000807216b4/0xffff800080721704/P/-/-/1/COND/-  0x0/0xffff8000807216ac/-/-/-/0//-`
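
To make the two-entry layout above concrete, here is a minimal sketch that
splits one such line into its SPE and PBT entries. It is not BOLT's parser:
`BrstackEntry`, `SpeSample`, `parseEntry` and `parseSample` are hypothetical
names introduced for illustration, and the sketch assumes each line carries a
PID followed by exactly two '/'-separated entries, as in the unit test input.

```cpp
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical holder for one brstack entry; not a BOLT type.
struct BrstackEntry {
  uint64_t From = 0;    // source address (0x0 in a PBT entry)
  uint64_t To = 0;      // target address
  bool Mispred = false; // true when the flags field contains 'M'
  std::string Type;     // e.g. "COND" or "RET"; empty in a PBT entry
};

// Hypothetical holder for one sample: PID, SPE entry, PBT entry.
struct SpeSample {
  int Pid = 0;
  BrstackEntry Spe;
  BrstackEntry Pbt;
};

// Splits "FROM/TO/FLAGS/-/-/CYCLES/TYPE/-" on '/' and keeps the fields
// used in this sketch. Returns a zeroed entry if fields are missing.
inline BrstackEntry parseEntry(const std::string &Text) {
  std::vector<std::string> Fields;
  std::string Field;
  std::istringstream In(Text);
  while (std::getline(In, Field, '/'))
    Fields.push_back(Field);

  BrstackEntry E;
  if (Fields.size() < 7)
    return E;
  E.From = std::stoull(Fields[0], nullptr, 16);
  E.To = std::stoull(Fields[1], nullptr, 16);
  E.Mispred = Fields[2].find('M') != std::string::npos;
  E.Type = Fields[6];
  return E;
}

// Reads "<PID> <SPE entry> <PBT entry>" separated by whitespace.
inline SpeSample parseSample(const std::string &Line) {
  std::istringstream In(Line);
  SpeSample S;
  std::string SpeText, PbtText;
  In >> S.Pid >> SpeText >> PbtText;
  S.Spe = parseEntry(SpeText);
  S.Pbt = parseEntry(PbtText);
  return S;
}
```

For a line such as `4567 0xd456/0xd789/M/-/-/7/RET/- 0x0/0xd123/-/-/-/0//-`,
the SPE entry carries both addresses and the misprediction flag, while the PBT
entry carries only its target address.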
---
 bolt/unittests/Profile/PerfSpeEvents.cpp | 72 ++++++++++++++++++++++++
 1 file changed, 72 insertions(+)
diff --git a/bolt/unittests/Profile/PerfSpeEvents.cpp b/bolt/unittests/Profile/PerfSpeEvents.cpp
index 8d023cd7b7e74..4407f4f494206 100644
--- a/bolt/unittests/Profile/PerfSpeEvents.cpp
+++ b/bolt/unittests/Profile/PerfSpeEvents.cpp
@@ -161,4 +161,76 @@ TEST_F(PerfSpeEventsTestHelper, SpeBranchesWithBrstack) {
   parseAndCheckBrstackEvents(1234, ExpectedSamples);
 }
 
+TEST_F(PerfSpeEventsTestHelper, SpeBranchesWithBrstackAndPbt) {
+  // Check perf input with SPE branch events as brstack format by
+  // combining with the previous branch target address (named as PBT).
+  // Example collection command:
+  // ```
+  // perf record -e 'arm_spe_0/branch_filter=1/u' -- BINARY
+  // ```
+  // How Bolt extracts the branch events:
+  // ```
+  // perf script -F pid,brstack --itrace=bl
+  // ```
+
+  opts::ArmSPE = true;
+  opts::ReadPerfEvents =
+      // "<PID> <SRC>/<DEST>/PN/-/-/10/COND/- <NULL>/<PBT>/-/-/-/0//-\n"
+      "  4567  0xa002/0xa003/PN/-/-/10/COND/- 0x0/0xa001/-/-/-/0//-\n"
+      "  4567  0xb002/0xb003/P/-/-/4/RET/- 0x0/0xb001/-/-/-/0//-\n"
+      "  4567  0xc456/0xc789/P/-/-/13/-/- 0x0/0xc123/-/-/-/0//-\n"
+      "  4567  0xd456/0xd789/M/-/-/7/RET/- 0x0/0xd123/-/-/-/0//-\n"
+      "  4567  0xe005/0xe009/P/-/-/14/RET/- 0x0/0xe001/-/-/-/0//-\n"
+      "  4567  0xd456/0xd789/M/-/-/7/RET/- 0x0/0xd123/-/-/-/0//-\n"
+      "  4567  0xf002/0xf003/MN/-/-/8/COND/- 0x0/0xf001/-/-/-/0//-\n"
+      "  4567  0xc456/0xc789/P/-/-/13/-/- 0x0/0xc123/-/-/-/0//-\n";
+
+  // ExpectedSamples contains the aggregated information about
+  // a branch {{From, To, TraceTo}, {TakenCount, MispredCount}}.
+  // When the SPE previous branch target address (named as PBT)
+  // feature is available, an SPE sample by combining this PBT feature,
+  // has two entries.
+  // Arm SPE records SRC/DEST addresses of the latest sampled branch operation,
+  // and it stores into the first entry. PBT records the target address of
+  // most recently taken branch in program order before the sampled operation,
+  // it places into the second entry.
+  // They are formed a chain of two consecutive branches.
+  // Where:
+  //   - The previous branch operation (PBT) is always taken.
+  //   - In SPE entry, the current source branch (SRC) may be either
+  //     fall-through or taken.
+  //   - The target address (DEST) of the recorded
+  //     branch operation is always what was architecturally executed.
+  // However PBT lacks associated information such as branch
+  // source address, branch type, and prediction bit.
+  // Considering this Trace pair:
+  //  {{0xd456, 0xd789, Trace::BR_ONLY}, {2, 2}},
+  //    {{0x0, 0xd123, 0xd456}, {2, 0}}
+  // For SPE trace please see the description above.
+  // The second entry is the PBT trace:
+  // {{0x0, 0xd123, 0xd456}, {2, 0}}.
+  // The PBT entry has a TakenCount = 2, as we have two samples for
+  // (0x0, 0xd123) entry in our input. The 'MispredsCount = 0' is
+  // always zero, because it lacks prediction information.
+  // It also has no information about source branch address therefore
+  // Bolt doesn't evaluate the 'From' field, and leaves it as zero (0x0).
+  // TraceTo = 0xc456, means the execution jumped from
+  // 0xc123 (PBT) to 0xc456 (SRC), and jumped further to 0xd789 (DEST).
+  std::vector<std::pair<Trace, TakenBranchInfo>> ExpectedSamples = {
+      {{0xa002, 0xa003, Trace::BR_ONLY}, {1, 0}},
+      {{0x0, 0xa001, 0xa002}, {1, 0}},
+      {{0xb002, 0xb003, Trace::BR_ONLY}, {1, 0}},
+      {{0x0, 0xb001, 0xb002}, {1, 0}},
+      {{0xc456, 0xc789, Trace::BR_ONLY}, {2, 0}},
+      {{0x0, 0xc123, 0xc456}, {2, 0}},
+      {{0xd456, 0xd789, Trace::BR_ONLY}, {2, 2}},
+      {{0x0, 0xd123, 0xd456}, {2, 0}},
+      {{0xe005, 0xe009, Trace::BR_ONLY}, {1, 0}},
+      {{0x0, 0xe001, 0xe005}, {1, 0}},
+      {{0xf002, 0xf003, Trace::BR_ONLY}, {1, 1}},
+      {{0x0, 0xf001, 0xf002}, {1, 0}}};
+
+  parseAndCheckBrstackEvents(4567, ExpectedSamples);
+}
+
 #endif
>From 666ff2fd27938c0ee2ce203749d9f25dadcf5814 Mon Sep 17 00:00:00 2001
From: Adam Kallai <kadam at inf.u-szeged.hu>
Date: Mon, 20 Oct 2025 12:48:49 +0200
Subject: [PATCH 2/3] Description is updated
---
 bolt/unittests/Profile/PerfSpeEvents.cpp | 69 ++++++++++++++----------
 1 file changed, 40 insertions(+), 29 deletions(-)
diff --git a/bolt/unittests/Profile/PerfSpeEvents.cpp b/bolt/unittests/Profile/PerfSpeEvents.cpp
index 4407f4f494206..a0705c00cff1c 100644
--- a/bolt/unittests/Profile/PerfSpeEvents.cpp
+++ b/bolt/unittests/Profile/PerfSpeEvents.cpp
@@ -187,35 +187,46 @@ TEST_F(PerfSpeEventsTestHelper, SpeBranchesWithBrstackAndPbt) {
 
   // ExpectedSamples contains the aggregated information about
   // a branch {{From, To, TraceTo}, {TakenCount, MispredCount}}.
-  // When the SPE previous branch target address (named as PBT)
-  // feature is available, an SPE sample by combining this PBT feature,
-  // has two entries.
-  // Arm SPE records SRC/DEST addresses of the latest sampled branch operation,
-  // and it stores into the first entry. PBT records the target address of
-  // most recently taken branch in program order before the sampled operation,
-  // it places into the second entry.
-  // They are formed a chain of two consecutive branches.
-  // Where:
-  //   - The previous branch operation (PBT) is always taken.
-  //   - In SPE entry, the current source branch (SRC) may be either
-  //     fall-through or taken.
-  //   - The target address (DEST) of the recorded
-  //     branch operation is always what was architecturally executed.
-  // However PBT lacks associated information such as branch
-  // source address, branch type, and prediction bit.
-  // Considering this Trace pair:
-  //  {{0xd456, 0xd789, Trace::BR_ONLY}, {2, 2}},
-  //    {{0x0, 0xd123, 0xd456}, {2, 0}}
-  // For SPE trace please see the description above.
-  // The second entry is the PBT trace:
-  // {{0x0, 0xd123, 0xd456}, {2, 0}}.
-  // The PBT entry has a TakenCount = 2, as we have two samples for
-  // (0x0, 0xd123) entry in our input. The 'MispredsCount = 0' is
-  // always zero, because it lacks prediction information.
-  // It also has no information about source branch address therefore
-  // Bolt doesn't evaluate the 'From' field, and leaves it as zero (0x0).
-  // TraceTo = 0xc456, means the execution jumped from
-  // 0xc123 (PBT) to 0xc456 (SRC), and jumped further to 0xd789 (DEST).
+  // Where
+  // - From: is the source address of the sampled branch operation.
+  // - To: is the target address of the sampled branch operation.
+  // - TraceTo could be either
+  //    - A 'Type = Trace::BR_ONLY', which means the trace only contains branch data.
+  //    - Or an address, when the trace contains information about the previous branch.
+  //
+  // When FEAT_SPE_PBT is present, Arm SPE emits two records per sample:
+  // - the current branch (Spe.From/Spe.To), and
+  // - the previous taken branch target (PBT) (PBT.From, PBT.To).
+  //
+  // Together they behave like a depth-1 branch stack where:
+  //   - the PBT entry is always taken
+  //   - the current branch entry may represent a taken branch or a fall-through
+  //   - the destination (Spe.To) is the architecturally executed target
+  //
+  // There can be fall-throughs to be inferred between the PBT entry and
+  // the current branch (Spe.From), but there cannot be between current
+  // branch's (Spe.From/Spe.To).
+  //
+  // PBT records only the target address (PBT.To), meaning we have no information as the
+  // branch source (PBT.From=0x0), branch type, and the prediction bit.
+  //
+  // Consider the trace pair:
+  // {{Spe.From, Spe.To, Type}, {TK, MP}}, {{PBT.From, PBT.To, TraceTo}, {TK, MP}}
+  // {{0xd456, 0xd789, Trace::BR_ONLY}, {2, 2}}, {{0x0, 0xd123, 0xd456}, {2, 0}}
+  //
+  // The first entry is the Spe record, which represents a trace from 0xd456 (Spe.From) to
+  // 0xd789 (Spe.To). Type = Trace::BR_ONLY, as Bolt processes the current branch event first.
+  // At this point we have no information about the previous trace (PBT).
+  // This entry has a TakenCount = 2, as we have two samples for (0xd456, 0xd789)
+  // in our input. It also has MispredsCount = 2, as 'M' misprediction flag appears
+  // in both cases.
+  //
+  // The second entry is the PBT record. TakenCount = 2 because the
+  // (PBT.From = 0x0, PBT.To = 0xd123) branch target appears twice in the input,
+  // and MispredsCount = 0 because prediction data is absent. There is no branch
+  // source information, so the PBT.From field is zero (0x0). TraceTo = 0xd456
+  // connect the flow to the previous taken branch at 0xd123 (PBT.To) to the current
+  // source branch at 0xd456 (Spe.From), which then continues to 0xd789 (Spe.To).
   std::vector<std::pair<Trace, TakenBranchInfo>> ExpectedSamples = {
       {{0xa002, 0xa003, Trace::BR_ONLY}, {1, 0}},
       {{0x0, 0xa001, 0xa002}, {1, 0}},
>From 9958168082414d497c53521bc1c0a422f2ebf6ae Mon Sep 17 00:00:00 2001
From: Adam Kallai <kadam at inf.u-szeged.hu>
Date: Mon, 20 Oct 2025 12:59:03 +0200
Subject: [PATCH 3/3] Fix the format
---
 bolt/unittests/Profile/PerfSpeEvents.cpp | 31 ++++++++++++++----------
 1 file changed, 18 insertions(+), 13 deletions(-)
diff --git a/bolt/unittests/Profile/PerfSpeEvents.cpp b/bolt/unittests/Profile/PerfSpeEvents.cpp
index a0705c00cff1c..4f060cd0aa7c8 100644
--- a/bolt/unittests/Profile/PerfSpeEvents.cpp
+++ b/bolt/unittests/Profile/PerfSpeEvents.cpp
@@ -191,8 +191,10 @@ TEST_F(PerfSpeEventsTestHelper, SpeBranchesWithBrstackAndPbt) {
   // - From: is the source address of the sampled branch operation.
   // - To: is the target address of the sampled branch operation.
   // - TraceTo could be either
-  //    - A 'Type = Trace::BR_ONLY', which means the trace only contains branch data.
-  //    - Or an address, when the trace contains information about the previous branch.
+  //    - A 'Type = Trace::BR_ONLY', which means the trace only contains branch
+  //    data.
+  //    - Or an address, when the trace contains information about the previous
+  //    branch.
   //
   // When FEAT_SPE_PBT is present, Arm SPE emits two records per sample:
   // - the current branch (Spe.From/Spe.To), and
@@ -207,26 +209,29 @@ TEST_F(PerfSpeEventsTestHelper, SpeBranchesWithBrstackAndPbt) {
   // the current branch (Spe.From), but there cannot be between current
   // branch's (Spe.From/Spe.To).
   //
-  // PBT records only the target address (PBT.To), meaning we have no information as the
-  // branch source (PBT.From=0x0), branch type, and the prediction bit.
+  // PBT records only the target address (PBT.To), meaning we have no
+  // information about the branch source (PBT.From=0x0), branch type, and the
+  // prediction bit.
   //
   // Consider the trace pair:
-  // {{Spe.From, Spe.To, Type}, {TK, MP}}, {{PBT.From, PBT.To, TraceTo}, {TK, MP}}
+  // {{Spe.From, Spe.To, Type}, {TK, MP}},
+  //   {{PBT.From, PBT.To, TraceTo}, {TK, MP}}
   // {{0xd456, 0xd789, Trace::BR_ONLY}, {2, 2}}, {{0x0, 0xd123, 0xd456}, {2, 0}}
   //
-  // The first entry is the Spe record, which represents a trace from 0xd456 (Spe.From) to
-  // 0xd789 (Spe.To). Type = Trace::BR_ONLY, as Bolt processes the current branch event first.
-  // At this point we have no information about the previous trace (PBT).
-  // This entry has a TakenCount = 2, as we have two samples for (0xd456, 0xd789)
-  // in our input. It also has MispredsCount = 2, as 'M' misprediction flag appears
-  // in both cases.
+  // The first entry is the Spe record, which represents a trace from 0xd456
+  // (Spe.From) to 0xd789 (Spe.To). Type = Trace::BR_ONLY, as Bolt processes the
+  // current branch event first. At this point we have no information about the
+  // previous trace (PBT). This entry has a TakenCount = 2, as we have two
+  // samples for (0xd456, 0xd789) in our input. It also has MispredsCount = 2,
+  // as 'M' misprediction flag appears in both cases.
   //
   // The second entry is the PBT record. TakenCount = 2 because the
   // (PBT.From = 0x0, PBT.To = 0xd123) branch target appears twice in the input,
   // and MispredsCount = 0 because prediction data is absent. There is no branch
   // source information, so the PBT.From field is zero (0x0). TraceTo = 0xd456
-  // connect the flow to the previous taken branch at 0xd123 (PBT.To) to the current
-  // source branch at 0xd456 (Spe.From), which then continues to 0xd789 (Spe.To).
+  // connects the flow from the previous taken branch at 0xd123 (PBT.To) to the
+  // current source branch at 0xd456 (Spe.From), which then continues to 0xd789
+  // (Spe.To).
   std::vector<std::pair<Trace, TakenBranchInfo>> ExpectedSamples = {
       {{0xa002, 0xa003, Trace::BR_ONLY}, {1, 0}},
       {{0x0, 0xa001, 0xa002}, {1, 0}},
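
The counting that the test comment describes can be modelled with a small
standalone sketch. This is a simplified model under stated assumptions, not
BOLT's DataAggregator: `Sample`, `Counts`, `TraceKey` and `aggregate` are
hypothetical names, and `BrOnly` is a placeholder standing in for
`Trace::BR_ONLY` whose value here is chosen arbitrarily.

```cpp
#include <cstdint>
#include <map>
#include <tuple>
#include <vector>

// Placeholder for Trace::BR_ONLY; the value is an assumption of this sketch.
constexpr uint64_t BrOnly = UINT64_MAX;

// Hypothetical input: one SPE sample together with its PBT target.
struct Sample {
  uint64_t SpeFrom;
  uint64_t SpeTo;
  bool Mispred;
  uint64_t PbtTo;
};

// Hypothetical counterpart of {TakenCount, MispredCount}.
struct Counts {
  uint64_t Taken = 0;
  uint64_t Mispred = 0;
};

// Key mirrors the {From, To, TraceTo} triple from the test comment.
using TraceKey = std::tuple<uint64_t, uint64_t, uint64_t>;

inline std::map<TraceKey, Counts>
aggregate(const std::vector<Sample> &Samples) {
  std::map<TraceKey, Counts> Result;
  for (const Sample &S : Samples) {
    // SPE entry: branch-only trace that keeps the prediction outcome.
    Counts &Spe = Result[TraceKey{S.SpeFrom, S.SpeTo, BrOnly}];
    ++Spe.Taken;
    if (S.Mispred)
      ++Spe.Mispred;

    // PBT entry: the source is unknown (0x0), the trace ends at Spe.From,
    // and the misprediction count is never incremented.
    Counts &Pbt = Result[TraceKey{uint64_t{0}, S.PbtTo, S.SpeFrom}];
    ++Pbt.Taken;
  }
  return Result;
}
```

Feeding in the two `0xd456/0xd789` samples from the test input, both flagged
'M', gives counts of {2, 2} for the SPE trace and {2, 0} for the matching PBT
trace, which agrees with the corresponding `ExpectedSamples` pair.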
    
    