[llvm] [BOLT][AArch64] Introduce SPE mode in BasicAggregation (PR #120741)
Paschalis Mpeis via llvm-commits
llvm-commits at lists.llvm.org
Wed Feb 5 08:04:53 PST 2025
================
@@ -1703,6 +1786,46 @@ std::error_code DataAggregator::parseBasicEvents() {
return std::error_code();
}
+std::error_code DataAggregator::parseSpeAsBasicEvents() {
+ outs() << "PERF2BOLT: parsing SPE data as basic events (no LBR)...\n";
+ NamedRegionTimer T("parseSPEBasic", "Parsing SPE as basic events",
+ TimerGroupName, TimerGroupDesc, opts::TimeAggregator);
+ uint64_t NumSpeBranchSamples = 0;
+
+ // Convert entries to one or two basic samples, depending on whether there is
+ // branch target information.
+ while (hasData()) {
+ auto SamplePair = parseSpeAsBasicSamples();
+ if (std::error_code EC = SamplePair.getError())
+ return EC;
+
+ auto registerSample = [this](const PerfBasicSample *Sample) {
+ if (!Sample->PC)
+ return;
+
+ if (BinaryFunction *BF = getBinaryFunctionContainingAddress(Sample->PC))
+ BF->setHasProfileAvailable();
+
+ ++BasicSamples[Sample->PC];
+ EventNames.insert(Sample->EventName);
+ };
+
+ if (SamplePair->first.PC != 0x0 && SamplePair->second.PC != 0x0)
+ ++NumSpeBranchSamples;
+
+ registerSample(&SamplePair->first);
+ registerSample(&SamplePair->second);
----------------
paschalis-mpeis wrote:
Hey Pavel,
Reading this back, your concern is whether storing samples at the `TGT` of branches that are not `NOT-TAKEN` might add hotness to a block that shouldn't have it. Correct?
That should not be a concern: regardless of whether a branch is taken or not, the reported `TGT` is what was architecturally executed. In other words, `NOT-TAKEN` (or its absence) characterizes what happened at the source branch (`PC`), while `TGT` always points to the path we ended up taking.
So, for fall-through SPE packets, the `TGT` address is always the address immediately after `PC` (i.e., `0xA00` + `4`, where `4` is the instruction size in AArch64):
```
PC 0xA00
B COND
EV RETIRED NOT-TAKEN
TGT 0xA04
```
For taken branches, the `TGT` can be further away than just `4`:
```
PC 0xA00
B COND
EV RETIRED
TGT 0xBBB
```
In my previous examples I was using mock addresses for PC/TGT, so I've updated any relevant examples to avoid confusion.
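To make the PC/TGT relationship concrete, here is a minimal standalone sketch. It is not the code in this patch: `SpeBranchPacket`, `targetsNextInstruction`, and `recordBasicSamples` are hypothetical names used only for illustration, and the packet is reduced to the three fields shown in the examples above.

```cpp
#include <cstdint>
#include <map>

// Hypothetical, simplified view of one SPE branch packet (not a BOLT type).
struct SpeBranchPacket {
  uint64_t PC;   // address of the sampled branch
  uint64_t TGT;  // address that was architecturally executed next
  bool NotTaken; // true when the EV field carries NOT-TAKEN
};

// AArch64 instructions have a fixed size of 4 bytes.
constexpr uint64_t AArch64InsnSize = 4;

// For a NOT-TAKEN packet, TGT is expected to be the instruction right after
// PC. This helper only checks that address relationship.
inline bool targetsNextInstruction(const SpeBranchPacket &P) {
  return P.TGT == P.PC + AArch64InsnSize;
}

// Count one basic sample for each non-zero address in the packet, so both
// the branch and the path actually taken receive a sample.
inline void recordBasicSamples(const SpeBranchPacket &P,
                               std::map<uint64_t, uint64_t> &Samples) {
  if (P.PC)
    ++Samples[P.PC];
  if (P.TGT)
    ++Samples[P.TGT];
}
```

Feeding the two packets above through `recordBasicSamples` would give `0xA00` two samples and `0xA04` and `0xBBB` one each, so only the path that was actually executed gains hotness.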
https://github.com/llvm/llvm-project/pull/120741