[PATCH] D77422: [llvm-exegesis] Add benchmark mode that uses LBR for more precise measurements.
Vy Nguyen via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Jun 3 12:05:39 PDT 2020
oontvoo added inline comments.
================
Comment at: llvm/tools/llvm-exegesis/llvm-exegesis.cpp:291
+ if (!ExpectedHostCpu.empty()) {
+ // The actual name could include variations, such as "skylake" vs
+ // "skylake-avx512" so we don't look for exact match.
----------------
ondrasej wrote:
> courbet wrote:
> > This is a bit brittle, because we could imagine the name of some unrelated CPUs being substrings of others. What about having a repeated option `--allowed-host-cpu=skylake --allowed-host-cpu=skylake-avx512 --allowed-host-cpu=whateverlake`, and check that the exact value is one of these ?
> Ideally, this should be defined in terms of CPU features (e.g. CPUID bits). Even better - each target should know which counters it supports, based on its platform-specific feature discovery mechanism. I understand that this would be a huge change for this CL, but we should at least have a FIXME here.
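To make the brittleness concern concrete, here is a minimal sketch comparing a substring check with an exact-match allowlist. It uses only the standard library. The function names `matchesBySubstring` and `isAllowedHostCpu` are hypothetical and do not come from the patch, and the exact condition used in the patch is not reproduced here. In llvm-exegesis the repeated option would presumably be declared as a `cl::list<std::string>`, but that is an assumption.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Substring-style check (hypothetical helper): accepts any host CPU name
// that contains the expected name anywhere inside it.
bool matchesBySubstring(const std::string &HostCpu,
                        const std::string &Expected) {
  return HostCpu.find(Expected) != std::string::npos;
}

// Exact-match check (hypothetical helper): accepts the host CPU only if its
// full name is one of the values passed via the repeated option.
bool isAllowedHostCpu(const std::string &HostCpu,
                      const std::vector<std::string> &Allowed) {
  return std::find(Allowed.begin(), Allowed.end(), HostCpu) != Allowed.end();
}
```

With `Allowed = {"skylake"}`, the host name `skylake-avx512` passes the substring check but is rejected by the exact-match check until it is listed explicitly.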
Actually, I had a question here that I was going to ask in an email, but here goes.
We know that the LBR format can be queried from the perf-capabilities MSR.
Specifically, we want: `MSR IA32_PERF_CAPABILITIES[5:0] == 000110B` (bits `59...63` are not relevant)
If I'm not mistaken, `perf_event_mmap_page::capabilities` should(?) give us that.
Except that when I run this on both Broadwell and Skylake, the `capabilities` field has the value `30` on both platforms, which it shouldn't. Of course, the difference here is that the `cycle` entries are all zeroes on Broadwell.
I haven't looked in detail, so I don't know whether this is an implementation detail of libpfm or whether we're just misinterpreting the SDM here.
Thoughts?
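
One way to sanity-check the value `30` is to decode it bit by bit. The sketch below assumes that `capabilities` is the flag word overlaid with the `cap_*` bitfield in `<linux/perf_event.h>`, with the first bitfield member in bit 0 as on x86 Linux. The bit table is copied from that header as I remember it and should be checked against the installed kernel headers. The helper names `decodePerfCapabilities` and `lowSixBits` are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Assumed layout of the bitfield that shares storage with
// perf_event_mmap_page::capabilities (verify against <linux/perf_event.h>).
struct CapBit {
  unsigned Bit;
  const char *Name;
};

static const CapBit KnownCapBits[] = {
    {0, "cap_bit0"},
    {1, "cap_bit0_is_deprecated"},
    {2, "cap_user_rdpmc"},
    {3, "cap_user_time"},
    {4, "cap_user_time_zero"},
    {5, "cap_user_time_short"},
};

// Hypothetical helper: returns the names of all known flags set in Caps,
// ordered from the lowest bit to the highest.
std::vector<std::string> decodePerfCapabilities(std::uint64_t Caps) {
  std::vector<std::string> Names;
  for (const CapBit &B : KnownCapBits)
    if (Caps & (std::uint64_t{1} << B.Bit))
      Names.push_back(B.Name);
  return Names;
}

// Hypothetical helper: extracts bits [5:0], the field that would be
// compared against 000110B if the value were the MSR contents.
std::uint64_t lowSixBits(std::uint64_t Value) { return Value & 0x3F; }

// Usage sketch: 30 is 011110B, so exactly bits 1 through 4 are set.
void exampleDecode() {
  const std::vector<std::string> Names = decodePerfCapabilities(30);
  assert(Names.size() == 4);
  assert(lowSixBits(30) != 0x06); // 011110B is not 000110B
}
```

Under these assumptions, `30` decodes to `cap_bit0_is_deprecated`, `cap_user_rdpmc`, `cap_user_time` and `cap_user_time_zero`. That would suggest the field describes perf's user-space access features rather than the contents of `IA32_PERF_CAPABILITIES`, which would also explain why it is identical on both machines.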
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D77422/new/
https://reviews.llvm.org/D77422