[all-commits] [llvm/llvm-project] 708506: [BOLT] Support pre-aggregated returns (#143296)
Amir Ayupov via All-commits
all-commits at lists.llvm.org
Fri Jun 20 03:17:30 PDT 2025
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: 7085065c02da6091dca91be201160912e43a63ec
https://github.com/llvm/llvm-project/commit/7085065c02da6091dca91be201160912e43a63ec
Author: Amir Ayupov <aaupov at fb.com>
Date: 2025-06-20 (Fri, 20 Jun 2025)
Changed paths:
M bolt/include/bolt/Profile/DataAggregator.h
M bolt/lib/Profile/DataAggregator.cpp
M bolt/test/X86/callcont-fallthru.s
M bolt/test/link_fdata.py
Log Message:
-----------
[BOLT] Support pre-aggregated returns (#143296)
Intel's Architectural LBR supports capturing branch type information
as part of the LBR stack (SDM Vol 3B, part 2, October 2024):
```
20.1.3.2 Branch Types

The IA32_LBR_x_INFO.BR_TYPE and IA32_LER_INFO.BR_TYPE fields encode
the branch types as shown in Table 20-3.

Table 20-3. IA32_LBR_x_INFO and IA32_LER_INFO Branch Type Encodings

Encoding | Branch Type
0000B    | COND
0001B    | NEAR_IND_JMP
0010B    | NEAR_REL_JMP
0011B    | NEAR_IND_CALL
0100B    | NEAR_REL_CALL
0101B    | NEAR_RET
011xB    | Reserved
1xxxB    | OTHER_BRANCH

For a list of branch operations that fall into the categories above,
see Table 20-2.

Table 20-2. Branch Type Filtering Details

Branch Type   | Operations Recorded
COND          | Jcc, J*CXZ, and LOOP*
NEAR_IND_JMP  | JMP r/m*
NEAR_REL_JMP  | JMP rel*
NEAR_IND_CALL | CALL r/m*
NEAR_REL_CALL | CALL rel* (excluding CALLs to the next sequential IP)
NEAR_RET      | RET (0C3H)
OTHER_BRANCH  | JMP/CALL ptr*, JMP/CALL m*, RET (0C8H), SYS*,
              | interrupts, exceptions (other than debug exceptions),
              | IRET, INT3, INTn, INTO, TSX Abort, EENTER, ERESUME,
              | EEXIT, AEX, INIT, SIPI, RSM
```
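As an illustration of the encoding in Table 20-3, the 4-bit BR_TYPE field
could be decoded roughly as follows. This is a minimal sketch, not code from
this patch: the `BranchType` enum and the `decodeBranchType` and `isNearReturn`
helpers are hypothetical names introduced here for illustration only.

```cpp
#include <cstdint>

// Hypothetical enum mirroring the rows of SDM Table 20-3.
enum class BranchType {
  Cond,
  NearIndJmp,
  NearRelJmp,
  NearIndCall,
  NearRelCall,
  NearRet,
  Reserved,
  OtherBranch
};

// Decode the 4-bit BR_TYPE encoding (hypothetical helper, not a BOLT API).
constexpr BranchType decodeBranchType(std::uint8_t Encoding) {
  Encoding &= 0xF; // only the low four bits carry the branch type
  if (Encoding & 0x8)
    return BranchType::OtherBranch; // 1xxxB
  switch (Encoding) {
  case 0x0:
    return BranchType::Cond; // 0000B
  case 0x1:
    return BranchType::NearIndJmp; // 0001B
  case 0x2:
    return BranchType::NearRelJmp; // 0010B
  case 0x3:
    return BranchType::NearIndCall; // 0011B
  case 0x4:
    return BranchType::NearRelCall; // 0100B
  case 0x5:
    return BranchType::NearRet; // 0101B
  default:
    return BranchType::Reserved; // 011xB
  }
}

// True only for NEAR_RET, the type this patch cares about.
constexpr bool isNearReturn(std::uint8_t Encoding) {
  return decodeBranchType(Encoding) == BranchType::NearRet;
}
```

With this sketch, `isNearReturn(0x5)` holds, while the reserved (0x6, 0x7) and
OTHER_BRANCH (0x8 to 0xF) encodings are not treated as near returns.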
The Linux kernel can preserve the branch type when `save_type` is enabled,
even if the CPU does not support Architectural LBR:
https://github.com/torvalds/linux/blob/f09079bd04a924c72d555cd97942d5f8d7eca98c/tools/perf/Documentation/perf-record.txt#L457-L460
> - save_type: save branch type during sampling in case binary is not
> available later.
> For the platforms with Intel Arch LBR support (12th-Gen+ client or
> 4th-Gen Xeon+ server), the save branch type is unconditionally enabled
> when the taken branch stack sampling is enabled.
Kernel-reported branch type values:
https://github.com/torvalds/linux/blob/8c6bc74c7f8910ed4c969ccec52e98716f98700a/include/uapi/linux/perf_event.h#L251-L269
This information is needed to disambiguate external returns (from
DSO/JIT) to an entry point or a landing pad, when BOLT can't
disassemble the branch source.
This patch adds new pre-aggregated types:
- return trace (R),
- external return fall-through (r).
For such types, the checks for fall-through start (not an entry or
a landing pad) are relaxed.
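One way to read the relaxed check is sketched below. This is an interpretation
under stated assumptions, not the DataAggregator code: only the `R` and `r`
type letters come from this message, while `TraceKind`, `FallthroughStart`,
`classifyTag` and `acceptFallthroughStart` are hypothetical names.

```cpp
// Hypothetical classification of a pre-aggregated record by its type letter.
// Only 'R' (return trace) and 'r' (external return fall-through) are taken
// from the commit message; every other letter is lumped into Other.
enum class TraceKind { Other, ReturnTrace, ExternalReturnFallthrough };

constexpr TraceKind classifyTag(char Tag) {
  return Tag == 'R'   ? TraceKind::ReturnTrace
         : Tag == 'r' ? TraceKind::ExternalReturnFallthrough
                      : TraceKind::Other;
}

// Hypothetical description of the address where a fall-through starts.
struct FallthroughStart {
  bool IsEntryPoint;
  bool IsLandingPad;
};

// Assumed reading of the relaxation: when the profile itself says the branch
// was a return, the fall-through start is accepted even if it is an entry
// point or a landing pad. Otherwise the usual restriction applies.
constexpr bool acceptFallthroughStart(TraceKind Kind, FallthroughStart Start) {
  if (Kind != TraceKind::Other)
    return true; // branch type is known to be a return: check relaxed
  return !Start.IsEntryPoint && !Start.IsLandingPad;
}
```

Under this reading, a fall-through that starts at a landing pad is rejected
for an ordinary record but accepted for an `R` or `r` record, because the
profile already states that the source branch was a return.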
Depends on #143295.
Test Plan: updated callcont-fallthru.s