[PATCH] D102522: [llvm-exegesis] Loop unrolling for loop snippet repetitor mode

Roman Lebedev via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri May 14 12:08:46 PDT 2021


lebedev.ri created this revision.
lebedev.ri added reviewers: courbet, gchatelet, RKSimon.
lebedev.ri added a project: LLVM.
Herald added subscribers: mstojanovic, pengfei.
lebedev.ri requested review of this revision.

I really needed this, like, factually, yesterday.

Consider the following example:

  $ ./bin/llvm-exegesis --mode=inverse_throughput --snippets-file=/tmp/snippet.s --num-repetitions=1000000 --repetition-mode=duplicate
  Check generated assembly with: /usr/bin/objdump -d /tmp/snippet-4a7e50.o
  ---
  mode:            inverse_throughput
  key:
    instructions:
      - 'VPXORYrr YMM0 YMM0 YMM0'
    config:          ''
    register_initial_values: []
  cpu_name:        znver3
  llvm_triple:     x86_64-unknown-linux-gnu
  num_repetitions: 1000000
  measurements:
    - { key: inverse_throughput, value: 0.31025, per_snippet_value: 0.31025 }
  error:           ''
  info:            ''
  assembled_snippet: C5FDEFC0C5FDEFC0C5FDEFC0C5FDEFC0C5FDEFC0C5FDEFC0C5FDEFC0C5FDEFC0C5FDEFC0C5FDEFC0C5FDEFC0C5FDEFC0C5FDEFC0C5FDEFC0C5FDEFC0C5FDEFC0C3
  ...

What does this tell us?
So wait, an inverse throughput of ~0.31 means it can only execute ~3 x86 AVX YMM PXOR zero-idioms per cycle?
That doesn't seem right. That is fewer than the number of pipes supporting this type of op.
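
For reference, the "~3 per cycle" figure is just the reciprocal of the reported value, since inverse throughput is cycles per instruction. A minimal sketch of that arithmetic (the function name is made up for illustration and is not part of llvm-exegesis):

```python
def instructions_per_cycle(inverse_throughput: float) -> float:
    """Convert cycles per instruction into instructions per cycle.

    Hypothetical helper for illustration; llvm-exegesis itself only
    reports the inverse throughput value.
    """
    if inverse_throughput <= 0:
        raise ValueError("inverse throughput must be positive")
    return 1.0 / inverse_throughput


# Value taken from the duplicate-mode measurement above.
duplicate_mode = instructions_per_cycle(0.31025)
print(f"duplicate mode: {duplicate_mode:.2f} instructions/cycle")
```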

Now, second example:

  $ ./bin/llvm-exegesis --mode=inverse_throughput --snippets-file=/tmp/snippet.s --num-repetitions=1000000 --repetition-mode=loop
  Check generated assembly with: /usr/bin/objdump -d /tmp/snippet-2418b5.o
  ---
  mode:            inverse_throughput
  key:
    instructions:
      - 'VPXORYrr YMM0 YMM0 YMM0'
    config:          ''
    register_initial_values: []
  cpu_name:        znver3
  llvm_triple:     x86_64-unknown-linux-gnu
  num_repetitions: 1000000
  measurements:
    - { key: inverse_throughput, value: 1.00011, per_snippet_value: 1.00011 }
  error:           ''
  info:            ''
  assembled_snippet: 49B80800000000000000C5FDEFC0C5FDEFC04983C0FF75F2C3
  ...

Now that's just worse. Due to the looping, the throughput completely collapsed,
and now we can only do a single instruction/cycle!?
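
To see where the loop overhead comes from, the recorded assembled_snippet can be picked apart by hand. The sketch below is my own reading of the x86-64 encodings in that hex string (movabs into r8, the VPXOR copies, add r8 with an 8-bit immediate, jne, ret). It is not code from the patch, and the function name is hypothetical.

```python
import struct

# Hex string copied from the loop-mode report above.
LOOP_SNIPPET_HEX = "49B80800000000000000C5FDEFC0C5FDEFC04983C0FF75F2C3"
VPXOR_YMM0 = bytes.fromhex("C5FDEFC0")  # vpxor %ymm0, %ymm0, %ymm0


def decode_loop_snippet(hex_text: str) -> dict:
    """Decode the specific loop layout seen in the report (assumed layout)."""
    code = bytes.fromhex(hex_text)
    pos = 0

    # movabs $imm64, %r8 : loads the loop counter.
    if code[pos:pos + 2] != b"\x49\xB8":
        raise ValueError("expected movabs into r8")
    (counter,) = struct.unpack_from("<q", code, pos + 2)
    pos += 10

    # Loop body: back-to-back copies of the snippet instruction.
    body_start = pos
    body_copies = 0
    while code[pos:pos + 4] == VPXOR_YMM0:
        body_copies += 1
        pos += 4

    # add $imm8, %r8 : counter update (imm8 is sign-extended).
    if code[pos:pos + 3] != b"\x49\x83\xC0":
        raise ValueError("expected add imm8 to r8")
    (step,) = struct.unpack_from("<b", code, pos + 3)
    pos += 4

    # jne rel8 : target is relative to the end of the jump instruction.
    if code[pos] != 0x75:
        raise ValueError("expected jne rel8")
    (rel,) = struct.unpack_from("<b", code, pos + 1)
    pos += 2
    jump_target = pos + rel

    if code[pos:] != b"\xC3":
        raise ValueError("expected a trailing ret")

    return {
        "counter": counter,
        "body_copies": body_copies,
        "step": step,
        "jumps_to_body_start": jump_target == body_start,
    }


print(decode_loop_snippet(LOOP_SNIPPET_HEX))
```

For the recorded snippet this gives a counter of 8, two copies of the instruction in the body, a step of -1, and a backward jump to the first copy. So every two snippet instructions are followed by an add and a taken branch. Note that this is the snippet as recorded in the report, which is much shorter than the 1000000 repetitions that were measured.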

That's not great.
And final example:

  $ ./bin/llvm-exegesis --mode=inverse_throughput --snippets-file=/tmp/snippet.s --num-repetitions=1000000 --repetition-mode=loop --loop-unroll-factor=1000
  Check generated assembly with: /usr/bin/objdump -d /tmp/snippet-c402e2.o
  ---
  mode:            inverse_throughput
  key:
    instructions:
      - 'VPXORYrr YMM0 YMM0 YMM0'
    config:          ''
    register_initial_values: []
  cpu_name:        znver3
  llvm_triple:     x86_64-unknown-linux-gnu
  num_repetitions: 1000000
  measurements:
    - { key: inverse_throughput, value: 0.167087, per_snippet_value: 0.167087 }
  error:           ''
  info:            ''
  assembled_snippet: 49B80800000000000000C5FDEFC0C5FDEFC04983C0FF75F2C3
  ...

So if we merge the previous two approaches, i.e. duplicate this single-instruction snippet 1000x
and run a loop with 1000 iterations over that duplicated/unrolled snippet,
the measured throughput goes through the roof, up to ~6.0 instructions/cycle (1/0.167087),
which finally tells us that this idiom is zero-cycle!
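
The relation between the two flags can be sketched as follows. This is a rough model, not the patch's implementation: the names are hypothetical, the requirement that the repetition count be divisible by the unroll factor is an assumption of the sketch, and the figure of two loop-control instructions per iteration (add + jne) is taken from the recorded loop-mode snippet.

```python
from dataclasses import dataclass

# Assumption: one counter update plus one conditional branch per iteration,
# as seen in the recorded loop-mode snippet.
LOOP_CONTROL_INSTRUCTIONS = 2


@dataclass(frozen=True)
class LoopPlan:
    iterations: int            # how many times the loop body runs
    body_instructions: int     # snippet instructions per loop body
    overhead_per_instruction: float  # loop-control instrs per snippet instr


def plan_loop(num_repetitions: int, unroll_factor: int,
              snippet_size: int = 1) -> LoopPlan:
    """Split num_repetitions into unroll_factor copies per loop iteration."""
    if unroll_factor <= 0 or snippet_size <= 0:
        raise ValueError("unroll factor and snippet size must be positive")
    if num_repetitions % unroll_factor != 0:
        # Simplification of this sketch only.
        raise ValueError("num_repetitions must be a multiple of unroll_factor")
    body = snippet_size * unroll_factor
    return LoopPlan(
        iterations=num_repetitions // unroll_factor,
        body_instructions=body,
        overhead_per_instruction=LOOP_CONTROL_INSTRUCTIONS / body,
    )


# Flags from the final example: --num-repetitions=1000000 --loop-unroll-factor=1000
print(plan_loop(1_000_000, 1000))
```

With the flags from the final example this yields 1000 iterations over a 1000-instruction body, so under this model the loop-control instructions are amortized over far more snippet instructions than in plain loop mode.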


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D102522

Files:
  llvm/docs/CommandGuide/llvm-exegesis.rst
  llvm/tools/llvm-exegesis/lib/BenchmarkResult.h
  llvm/tools/llvm-exegesis/lib/BenchmarkRunner.cpp
  llvm/tools/llvm-exegesis/lib/BenchmarkRunner.h
  llvm/tools/llvm-exegesis/lib/SnippetRepetitor.cpp
  llvm/tools/llvm-exegesis/lib/SnippetRepetitor.h
  llvm/tools/llvm-exegesis/llvm-exegesis.cpp
  llvm/unittests/tools/llvm-exegesis/X86/SnippetRepetitorTest.cpp

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D102522.345524.patch
Type: text/x-patch
Size: 12230 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20210514/f3595290/attachment.bin>

