[PATCH] D77422: [llvm-exegesis] Add benchmark mode that uses LBR for more precise measurements.

Vy Nguyen via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Jun 3 12:05:39 PDT 2020


oontvoo added inline comments.


================
Comment at: llvm/tools/llvm-exegesis/llvm-exegesis.cpp:291
+  if (!ExpectedHostCpu.empty()) {
+    // The actual name could include variations, such as "skylake" vs
+    // "skylake-avx512" so we don't look for exact match.
----------------
ondrasej wrote:
> courbet wrote:
> > This is a bit brittle, because we could imagine the names of some unrelated CPUs being substrings of others. What about having a repeated option `--allowed-host-cpu=skylake --allowed-host-cpu=skylake-avx512 --allowed-host-cpu=whateverlake`, and checking that the exact value is one of these?
> Ideally, this should be defined in terms of CPU features (e.g. CPUID bits). Even better - each target should know which counters it supports, based on its platform-specific feature discovery mechanism. I understand that this would be a huge change for this CL, but we should at least have a FIXME here.
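For reference, the exact-match allow-list suggested above could be sketched roughly as follows. The helper name `isAllowedHostCpu` and its signature are hypothetical and not part of this patch; how the repeated `--allowed-host-cpu` values get collected into the list is left out.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper (not from the patch): returns true only when HostCpu is
// exactly equal to one of the allowed names. No substring matching is done,
// so "skylake" does not implicitly accept "skylake-avx512" or vice versa.
bool isAllowedHostCpu(const std::string &HostCpu,
                      const std::vector<std::string> &Allowed) {
  return std::find(Allowed.begin(), Allowed.end(), HostCpu) != Allowed.end();
}
```

With an allow-list of `skylake` and `skylake-avx512`, a host reported as `skylake-avx512` passes, while a name that merely contains `skylake` as a substring is rejected.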
Actually, I had a question here that I was going to ask in an email, but here goes.

We know that the LBR format can be queried from the perf-cap MSR.
Specifically, we want: MSR `IA32_PERF_CAPABILITIES[5:0] == 000110B` (bits `59...63` are not relevant).
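
As a minimal sketch of the check above, assuming the raw 64-bit MSR value has already been obtained somehow (reading it is out of scope here), the low six bits can be masked out and compared. The names `lbrFormat`, `ExpectedLbrFormat`, and `hasExpectedLbrFormat` are made up for illustration.

```cpp
#include <cstdint>

// Hypothetical helper: extracts bits [5:0] of a raw IA32_PERF_CAPABILITIES
// value, i.e. the LBR format field referred to above.
constexpr uint64_t lbrFormat(uint64_t PerfCapabilities) {
  return PerfCapabilities & 0x3F; // 0x3F == 111111B, keeps bits 5..0 only.
}

// 000110B, the format value named in the comment above.
constexpr uint64_t ExpectedLbrFormat = 0x06;

// True when the LBR format field matches; all higher bits are ignored.
constexpr bool hasExpectedLbrFormat(uint64_t PerfCapabilities) {
  return lbrFormat(PerfCapabilities) == ExpectedLbrFormat;
}
```

Note that this only tests the six-bit field; whether the value handed to it really is the MSR content is a separate question.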

If I'm not mistaken, `perf_event_mmap_page::capabilities` should(?) give us that.
However, when I run this on both Broadwell and Skylake, the `capabilities` field has a value of `30` on both platforms (it shouldn't). Of course, the difference here is that the `cycle` entries are all zeroes on Broadwell.
I haven't looked into the details, so I don't know if this is an implementation detail of libpfm or if we're just misinterpreting the SDM here.

Thoughts?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77422/new/

https://reviews.llvm.org/D77422




