[llvm] [BOLT][AArch64] Introduce SPE mode in BasicAggregation (PR #120741)
via llvm-commits
llvm-commits at lists.llvm.org
Mon Jan 20 09:34:53 PST 2025
mikewilliams-arm wrote:
> is there a way to configure SPE to only collect taken branches?
For the avoidance of doubt, and for the benefit of anyone who finds this and reads it out of context: you *can* configure SPE to collect only taken branches, but only from FEAT_SPEv1p2 onwards. That is still a relatively new feature in the field. From looking at the kernel sources, you need to check that `/sys/devices/arm_spe_0/format/inv_event_filter` exists, and the syntax would be something like `perf record -e arm_spe_0/branch_filter=1,inv_event_filter=64/`. (I might be wrong - I don't have access to such a system.)
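As a rough sketch of that check, the following Python picks the event string depending on whether the sysfs file is present. The helper name `spe_branch_event` and its `sysfs_root` parameter are made up for illustration; the path and both event strings are the ones quoted above, and the `inv_event_filter=64` value is just as unverified here as it is there.

```python
from pathlib import Path


def spe_branch_event(sysfs_root="/sys/devices/arm_spe_0"):
    """Return (event_string, needs_software_filter).

    Hypothetical helper: assumes the presence of format/inv_event_filter
    means the PMU can drop not-taken branches in hardware.
    """
    base = "arm_spe_0/branch_filter=1"
    if (Path(sysfs_root) / "format" / "inv_event_filter").exists():
        # Value 64 is copied from the message above and has not been
        # tested on real FEAT_SPEv1p2 hardware.
        return base + ",inv_event_filter=64/", False
    # Older SPE: record every branch and filter not-taken ones afterwards.
    return base + "/", True


if __name__ == "__main__":
    event, needs_filter = spe_branch_event()
    print("perf record -e", event)
    if needs_filter:
        print("not-taken branches must be filtered out in software")
```

On a system without the file this falls back to the plain `branch_filter=1` event and flags that the not-taken branches still need to be dropped post-hoc.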
You can always do this filtering post-hoc in software. `perf record -e arm_spe_0/branch_filter=1/` should work on all SPE implementations, and according to Google's AI, 60% of branches are taken, so storing all the not-taken branches and filtering them out afterwards costs about 66% more than recording taken branches alone (40 not-taken for every 60 taken). `perf script --inject=b` used to do a poor job of preserving all the branch information through the injected events, which made this harder to do. I believe that is being looked into, if it's not already addressed.
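The overhead figure is just the ratio of not-taken to taken branches. A minimal sketch of that arithmetic, assuming every sampled branch costs the same to store (the function name is hypothetical, and the 60% input is the unverified figure quoted above):

```python
def not_taken_overhead(taken_fraction):
    """Extra records stored, relative to recording taken branches only.

    Assumes each sampled branch record has the same storage cost.
    """
    if not 0.0 < taken_fraction <= 1.0:
        raise ValueError("taken_fraction must be in (0, 1]")
    return (1.0 - taken_fraction) / taken_fraction


if __name__ == "__main__":
    # 60% taken is the figure from the message, not a measured value.
    print(f"overhead: {not_taken_overhead(0.6):.1%}")
```

With a different taken-branch ratio for a given workload, the overhead changes accordingly: at 50% taken it is 100%, at 80% taken it is 25%.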
However, even so, each sampled branch is exactly that - a single sampled branch. SPE does not collect sequences of branches other than through the aforementioned optional PBT extension. So, from any one sample you can only infer that the branch source and the branch target were executed.
https://github.com/llvm/llvm-project/pull/120741