[PATCH] D127083: [MCA] Introducing incremental SourceMgr and resumable pipeline

Min-Yih Hsu via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Jun 6 15:47:00 PDT 2022


myhsu added a comment.

> Did you notice a perf regression after this change on normal llvm-mca runs with several iterations?

Following up on the potential performance regression caused by abstracting away SourceMgr, here is my experiment:

- Baseline SHA: 8f7b14898fe32f9c41059517a5a3872ef089174b <https://reviews.llvm.org/rG8f7b14898fe32f9c41059517a5a3872ef089174b>
- Host arch / OS: Core i7-8700K / Ubuntu 20.04
- MCA command: `llvm-mca -mcpu=btver2 -mtriple=x86_64-unknown-linux /path/to/llvm-project/llvm/test/tools/llvm-mca/X86/BtVer2/resources-sse2.s -o /dev/null`
- Perf command: `perf stat --repeat=1000 -- <MCA command>`

Here are the baseline numbers:

        48.43 msec task-clock                #    0.997 CPUs utilized            ( +-  0.08% )
            0      context-switches          #    0.008 K/sec                    ( +-  7.25% )
            0      cpu-migrations            #    0.000 K/sec                  
          767      page-faults               #    0.016 M/sec                    ( +-  0.01% )
  220,767,126      cycles                    #    4.558 GHz                      ( +-  0.01% )
  477,655,890      instructions              #    2.16  insn per cycle           ( +-  0.00% )
  112,417,237      branches                  # 2321.072 M/sec                    ( +-  0.00% )
    1,266,955      branch-misses             #    1.13% of all branches          ( +-  0.02% )
  
    0.0485851 +- 0.0000370 seconds time elapsed  ( +-  0.08% )

Here are the numbers after this particular patch is applied:

        47.90 msec task-clock                #    0.997 CPUs utilized            ( +-  0.07% )
            0      context-switches          #    0.003 K/sec                    ( +-  7.75% )
            0      cpu-migrations            #    0.000 K/sec                  
          762      page-faults               #    0.016 M/sec                    ( +-  0.01% )
  220,706,375      cycles                    #    4.608 GHz                      ( +-  0.01% )
  479,569,839      instructions              #    2.17  insn per cycle           ( +-  0.00% )
  112,961,518      branches                  # 2358.488 M/sec                    ( +-  0.00% )
    1,261,302      branch-misses             #    1.12% of all branches          ( +-  0.02% )
  
    0.0480387 +- 0.0000322 seconds time elapsed  ( +-  0.07% )

Looking at the instruction count, which is a more stable metric than elapsed time, there is an increase (477,655,890 to 479,569,839), but it only amounts to about 0.4%.
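
For anyone who wants to double-check that figure, the relative change can be recomputed from the counters in the two `perf stat` summaries. This is only a minimal sketch: the names `baseline`, `patched` and `relative_change` are illustrative and not part of the patch, and the values are copied by hand from the output above.

```python
# Counter values copied from the two perf stat summaries
# (baseline vs. with the patch applied).
baseline = {
    "cycles": 220_767_126,
    "instructions": 477_655_890,
    "branches": 112_417_237,
    "branch-misses": 1_266_955,
}
patched = {
    "cycles": 220_706_375,
    "instructions": 479_569_839,
    "branches": 112_961_518,
    "branch-misses": 1_261_302,
}


def relative_change(before: int, after: int) -> float:
    """Return the change from `before` to `after` as a percentage of `before`."""
    return (after - before) / before * 100.0


# Percentage change for every counter present in both runs.
deltas = {name: relative_change(baseline[name], patched[name]) for name in baseline}

for name, pct in deltas.items():
    print(f"{name:>14}: {pct:+.2f}%")
```

With these inputs the instruction count comes out roughly 0.40% higher, while the cycle count is marginally lower than the baseline.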


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D127083/new/

https://reviews.llvm.org/D127083
