[llvm-bugs] [Bug 50453] New: [exegesis] Problem with partial register writes

via llvm-bugs llvm-bugs at lists.llvm.org
Mon May 24 03:18:07 PDT 2021


https://bugs.llvm.org/show_bug.cgi?id=50453

            Bug ID: 50453
           Summary: [exegesis] Problem with partial register writes
           Product: tools
           Version: trunk
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: llvm-exegesis
          Assignee: unassignedbugs at nondot.org
          Reporter: lebedev.ri at gmail.com
                CC: clement.courbet at gmail.com, gchatelet at google.com,
                    llvm-bugs at lists.llvm.org

This was initially brought up by Simon Pilgrim.
Currently, the znver3 and bdver2 sched models overestimate latency and inverse
throughput for i8/i16 instructions. For example, an 8-bit multiplication
implicitly reads AL, the low i8 of AX, which is also the i16 register it
implicitly writes. This partial register write causes an extra 1cy of latency:

$ ./bin/llvm-exegesis -num-repetitions=10000 -mode=latency --snippets-file=/tmp/test.s
Check generated assembly with: /usr/bin/objdump -d /tmp/snippet-b1ab3e.o
---
mode:            latency
key:
  instructions:
    - 'MUL8r DL'
  config:          ''
  register_initial_values:
    - 'DL=0x0'
    - 'AL=0x0'
cpu_name:        znver3
llvm_triple:     x86_64-unknown-linux-gnu
num_repetitions: 10000
measurements:
  - { key: latency, value: 3.0026, per_snippet_value: 3.0026 }
error:           ''
info:            ''
assembled_snippet:
B200B000F6E2F6E2F6E2F6E2F6E2F6E2F6E2F6E2F6E2F6E2F6E2F6E2F6E2F6E2F6E2F6E2C3
...
$ ./bin/llvm-exegesis -num-repetitions=10000 -mode=latency --snippets-file=/tmp/test.s
Check generated assembly with: /usr/bin/objdump -d /tmp/snippet-b973e8.o
---
mode:            latency
key:
  instructions:
    - 'XOR8rr AL AL AL'
  config:          ''
  register_initial_values:
    - 'DL=0x0'
    - 'AL=0x0'
cpu_name:        znver3
llvm_triple:     x86_64-unknown-linux-gnu
num_repetitions: 10000
measurements:
  - { key: latency, value: 1.0031, per_snippet_value: 1.0031 }
error:           ''
info:            ''
assembled_snippet:
B200B00030C030C030C030C030C030C030C030C030C030C030C030C030C030C030C030C0C3
...
$ ./bin/llvm-exegesis -num-repetitions=10000 -mode=latency --snippets-file=/tmp/test.s
Check generated assembly with: /usr/bin/objdump -d /tmp/snippet-9093d7.o
---
mode:            latency
key:
  instructions:
    - 'XOR8rr AL AL AL'
    - 'MUL8r DL'
  config:          ''
  register_initial_values:
    - 'DL=0x0'
    - 'AL=0x0'
cpu_name:        znver3
llvm_triple:     x86_64-unknown-linux-gnu
num_repetitions: 10000
measurements:
  - { key: latency, value: 2.003, per_snippet_value: 4.006 }
error:           ''
info:            ''
assembled_snippet:
B200B00030C0F6E230C0F6E230C0F6E230C0F6E230C0F6E230C0F6E230C0F6E230C0F6E2C3
...
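
For reference, the assembled_snippet bytes in the reports above can be read
back by hand. The following is a minimal sketch, not part of llvm-exegesis:
the lookup table only covers the five byte patterns that occur in these
snippets (standard x86 encodings for MOV r8, imm8; MUL r/m8; XOR r/m8, r8;
RET), and the names PATTERNS and decode are made up for illustration. The
input below is the combined snippet shortened to a single repetition of its
loop body.

```python
# Hand-written table of the only encodings used in these snippets.
# This is NOT a general x86 decoder; PATTERNS and decode are illustrative names.
PATTERNS = [
    (bytes.fromhex("B200"), "mov dl, 0x0"),  # register_initial_values: DL=0x0
    (bytes.fromhex("B000"), "mov al, 0x0"),  # register_initial_values: AL=0x0
    (bytes.fromhex("F6E2"), "mul dl"),       # MUL8r DL: reads AL, writes AX
    (bytes.fromhex("30C0"), "xor al, al"),   # XOR8rr AL AL AL
    (bytes.fromhex("C3"), "ret"),
]


def decode(hex_snippet):
    """Translate an assembled_snippet hex string into a list of mnemonics."""
    data = bytes.fromhex(hex_snippet)
    out = []
    pos = 0
    while pos < len(data):
        for pattern, text in PATTERNS:
            if data.startswith(pattern, pos):
                out.append(text)
                pos += len(pattern)
                break
        else:
            raise ValueError(f"unknown bytes at offset {pos}")
    return out


if __name__ == "__main__":
    # Setup, one XOR/MUL pair, and the trailing ret.
    for line in decode("B200B00030C0F6E2C3"):
        print(line)
```

In the combined snippet every XOR8rr writes AL right before a MUL8r that
reads AL and writes AX, so each instruction depends on the previous one.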



$ ./bin/llvm-exegesis -num-repetitions=10000 -mode=inverse_throughput --snippets-file=/tmp/test.s
Check generated assembly with: /usr/bin/objdump -d /tmp/snippet-21da4b.o
---
mode:            inverse_throughput
key:
  instructions:
    - 'MUL8r DL'
  config:          ''
  register_initial_values:
    - 'DL=0x0'
    - 'AL=0x0'
cpu_name:        znver3
llvm_triple:     x86_64-unknown-linux-gnu
num_repetitions: 10000
measurements:
  - { key: inverse_throughput, value: 3.0029, per_snippet_value: 3.0029 }
error:           ''
info:            ''
assembled_snippet:
B200B000F6E2F6E2F6E2F6E2F6E2F6E2F6E2F6E2F6E2F6E2F6E2F6E2F6E2F6E2F6E2F6E2C3
...
$ ./bin/llvm-exegesis -num-repetitions=10000 -mode=inverse_throughput --snippets-file=/tmp/test.s
Check generated assembly with: /usr/bin/objdump -d /tmp/snippet-2e2e4f.o
---
mode:            inverse_throughput
key:
  instructions:
    - 'XOR8rr AL AL AL'
  config:          ''
  register_initial_values:
    - 'DL=0x0'
    - 'AL=0x0'
cpu_name:        znver3
llvm_triple:     x86_64-unknown-linux-gnu
num_repetitions: 10000
measurements:
  - { key: inverse_throughput, value: 1.0028, per_snippet_value: 1.0028 }
error:           ''
info:            ''
assembled_snippet:
B200B00030C030C030C030C030C030C030C030C030C030C030C030C030C030C030C030C0C3
...
$ ./bin/llvm-exegesis -num-repetitions=10000 -mode=inverse_throughput --snippets-file=/tmp/test.s
Check generated assembly with: /usr/bin/objdump -d /tmp/snippet-43fb9e.o
---
mode:            inverse_throughput
key:
  instructions:
    - 'XOR8rr AL AL AL'
    - 'MUL8r DL'
  config:          ''
  register_initial_values:
    - 'DL=0x0'
    - 'AL=0x0'
cpu_name:        znver3
llvm_triple:     x86_64-unknown-linux-gnu
num_repetitions: 10000
measurements:
  - { key: inverse_throughput, value: 2.0031, per_snippet_value: 4.0062 }
error:           ''
info:            ''
assembled_snippet:
B200B00030C0F6E230C0F6E230C0F6E230C0F6E230C0F6E230C0F6E230C0F6E230C0F6E2C3
...
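
A quick arithmetic check on the numbers above, as a sketch only. The values
are copied from the reports. The helper name serial_gap and the 0.01 cycle
tolerance are arbitrary choices made here, and reading a near-zero gap as
"the pair executes as one serial chain" is an interpretation, not something
llvm-exegesis reports.

```python
# Measurements copied from the znver3 reports above (cycles).
latency = {"MUL8r": 3.0026, "XOR8rr": 1.0031, "pair_per_snippet": 4.006}
inverse_throughput = {"MUL8r": 3.0029, "XOR8rr": 1.0028, "pair_per_snippet": 4.0062}


def serial_gap(m):
    """Per-snippet cost of the XOR8rr+MUL8r pair minus the sum of its parts."""
    return m["pair_per_snippet"] - (m["MUL8r"] + m["XOR8rr"])


def per_instruction(m, num_instructions=2):
    """The reported 'value' is the per-snippet value divided by snippet size."""
    return m["pair_per_snippet"] / num_instructions


if __name__ == "__main__":
    for name, m in (("latency", latency), ("inverse_throughput", inverse_throughput)):
        gap = serial_gap(m)
        # Tolerance of 0.01 cycles is an assumption to absorb measurement noise.
        verdict = "sum of parts" if abs(gap) < 0.01 else "differs from sum"
        print(f"{name}: gap={gap:.4f} per_instruction={per_instruction(m):.4f} ({verdict})")
```

In both modes the pair costs about as much as MUL8r and XOR8rr measured
separately, and MUL8r alone shows the same 3cy in latency and
inverse_throughput mode, which suggests that no overlap between consecutive
instructions is being measured in either case.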

What would be the best way to work around that?
