[PATCH] D139283: [llvm-exegesis] parallel snippet generator: avoid Read-After-Write pitfall for instrs w/ tied variables

Roman Lebedev via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Dec 5 08:13:29 PST 2022


lebedev.ri added a comment.

In D139283#3970982 <https://reviews.llvm.org/D139283#3970982>, @RKSimon wrote:

> random thought - how is exegesis handling the SSE PBLENDVB/BLENDVPS/BLENDVPD instructions? They have an explicit dependency on xmm0 as well which might also be causing issues

Please let me know if this is bad, or if I'm looking at the wrong instruction:

  $ ./bin/llvm-exegesis --mode=inverse_throughput --opcode-name=BLENDVPSrr0 --max-configs-per-opcode=9182
  Check generated assembly with: /usr/bin/objdump -d /tmp/snippet-77314f.o
  ---
  mode:            inverse_throughput
  key:
    instructions:
      - 'BLENDVPSrr0 XMM1 XMM1 XMM5'
      - 'BLENDVPSrr0 XMM12 XMM12 XMM15'
      - 'BLENDVPSrr0 XMM14 XMM14 XMM9'
      - 'BLENDVPSrr0 XMM3 XMM3 XMM8'
      - 'BLENDVPSrr0 XMM7 XMM7 XMM4'
      - 'BLENDVPSrr0 XMM13 XMM13 XMM6'
      - 'BLENDVPSrr0 XMM10 XMM10 XMM8'
      - 'BLENDVPSrr0 XMM11 XMM11 XMM8'
      - 'BLENDVPSrr0 XMM2 XMM2 XMM8'
    config:          ''
    register_initial_values:
      - 'XMM1=0x0'
      - 'XMM5=0x0'
      - 'XMM0=0x0'
      - 'XMM12=0x0'
      - 'XMM15=0x0'
      - 'XMM14=0x0'
      - 'XMM9=0x0'
      - 'XMM3=0x0'
      - 'XMM8=0x0'
      - 'XMM7=0x0'
      - 'XMM4=0x0'
      - 'XMM13=0x0'
      - 'XMM6=0x0'
      - 'XMM10=0x0'
      - 'XMM11=0x0'
      - 'XMM2=0x0'
  cpu_name:        znver3
  llvm_triple:     x86_64-unknown-linux-gnu
  num_repetitions: 10000
  measurements:
    - { key: inverse_throughput, value: 0.5294, per_snippet_value: 4.7646 }
  error:           ''
  info:            instruction has tied variables, avoiding Read-After-Write issue, picking random def and use registers not aliasing each other, for uses, randomizing registers
  assembled_snippet: 4883EC10C7042400000000C744240400000000C744240800000000C744240C00000000C5FA6F0C244883C4104883EC10C7042400000000C744240400000000C744240800000000C744240C00000000C5FA6F2C244883C4104883EC10C7042400000000C744240400000000C744240800000000C744240C00000000C5FA6F04244883C4104883EC10C7042400000000C744240400000000C744240800000000C744240C00000000C57A6F24244883C4104883EC10C7042400000000C744240400000000C744240800000000C744240C00000000C57A6F3C244883C4104883EC10C7042400000000C744240400000000C744240800000000C744240C00000000C57A6F34244883C4104883EC10C7042400000000C744240400000000C744240800000000C744240C00000000C57A6F0C244883C4104883EC10C7042400000000C744240400000000C744240800000000C744240C00000000C5FA6F1C244883C4104883EC10C7042400000000C744240400000000C744240800000000C744240C00000000C57A6F04244883C4104883EC10C7042400000000C744240400000000C744240800000000C744240C00000000C5FA6F3C244883C4104883EC10C7042400000000C744240400000000C744240800000000C744240C00000000C5FA6F24244883C4104883EC10C7042400000000C744240400000000C744240800000000C744240C00000000C57A6F2C244883C4104883EC10C7042400000000C744240400000000C744240800000000C744240C00000000C5FA6F34244883C4104883EC10C7042400000000C744240400000000C744240800000000C744240C00000000C57A6F14244883C4104883EC10C7042400000000C744240400000000C744240800000000C744240C00000000C57A6F1C244883C4104883EC10C7042400000000C744240400000000C744240800000000C744240C00000000C5FA6F14244883C410660F3814CD66450F3814E766450F3814F166410F3814D8660F3814FC66440F3814EE66450F3814D066450F3814D866410F3814D0660F3814CD66450F3814E766450F3814F166410F3814D8660F3814FC66440F3814EE66450F3814D066450F3814D866410F3814D0660F3814CD66450F3814E766450F3814F166410F3814D8660F3814FC66440F3814EE66450F3814D066450F3814D866410F3814D0660F3814CD66450F3814E766450F3814F166410F3814D8660F3814FC66440F3814EE66450F3814D066450F3814D866410F3814D0C3
  ...
  Check generated assembly with: /usr/bin/objdump -d /tmp/snippet-d0b778.o
  ---
  mode:            inverse_throughput
  key:
    instructions:
      - 'BLENDVPSrr0 XMM15 XMM15 XMM9'
      - 'BLENDVPSrr0 XMM6 XMM6 XMM9'
      - 'BLENDVPSrr0 XMM13 XMM13 XMM9'
      - 'BLENDVPSrr0 XMM5 XMM5 XMM9'
      - 'BLENDVPSrr0 XMM11 XMM11 XMM9'
      - 'BLENDVPSrr0 XMM3 XMM3 XMM9'
      - 'BLENDVPSrr0 XMM10 XMM10 XMM9'
      - 'BLENDVPSrr0 XMM12 XMM12 XMM9'
      - 'BLENDVPSrr0 XMM8 XMM8 XMM9'
      - 'BLENDVPSrr0 XMM14 XMM14 XMM9'
      - 'BLENDVPSrr0 XMM1 XMM1 XMM9'
      - 'BLENDVPSrr0 XMM4 XMM4 XMM9'
      - 'BLENDVPSrr0 XMM7 XMM7 XMM9'
      - 'BLENDVPSrr0 XMM2 XMM2 XMM9'
    config:          ''
    register_initial_values:
      - 'XMM15=0x0'
      - 'XMM9=0x0'
      - 'XMM0=0x0'
      - 'XMM6=0x0'
      - 'XMM13=0x0'
      - 'XMM5=0x0'
      - 'XMM11=0x0'
      - 'XMM3=0x0'
      - 'XMM10=0x0'
      - 'XMM12=0x0'
      - 'XMM8=0x0'
      - 'XMM14=0x0'
      - 'XMM1=0x0'
      - 'XMM4=0x0'
      - 'XMM7=0x0'
      - 'XMM2=0x0'
  cpu_name:        znver3
  llvm_triple:     x86_64-unknown-linux-gnu
  num_repetitions: 10000
  measurements:
    - { key: inverse_throughput, value: 0.5291, per_snippet_value: 7.4074 }
  error:           ''
  info:            instruction has tied variables, avoiding Read-After-Write issue, picking random def and use registers not aliasing each other, for uses, one unique register for each position
  assembled_snippet: 4883EC10C7042400000000C744240400000000C744240800000000C744240C00000000C57A6F3C244883C4104883EC10C7042400000000C744240400000000C744240800000000C744240C00000000C57A6F0C244883C4104883EC10C7042400000000C744240400000000C744240800000000C744240C00000000C5FA6F04244883C4104883EC10C7042400000000C744240400000000C744240800000000C744240C00000000C5FA6F34244883C4104883EC10C7042400000000C744240400000000C744240800000000C744240C00000000C57A6F2C244883C4104883EC10C7042400000000C744240400000000C744240800000000C744240C00000000C5FA6F2C244883C4104883EC10C7042400000000C744240400000000C744240800000000C744240C00000000C57A6F1C244883C4104883EC10C7042400000000C744240400000000C744240800000000C744240C00000000C5FA6F1C244883C4104883EC10C7042400000000C744240400000000C744240800000000C744240C00000000C57A6F14244883C4104883EC10C7042400000000C744240400000000C744240800000000C744240C00000000C57A6F24244883C4104883EC10C7042400000000C744240400000000C744240800000000C744240C00000000C57A6F04244883C4104883EC10C7042400000000C744240400000000C744240800000000C744240C00000000C57A6F34244883C4104883EC10C7042400000000C744240400000000C744240800000000C744240C00000000C5FA6F0C244883C4104883EC10C7042400000000C744240400000000C744240800000000C744240C00000000C5FA6F24244883C4104883EC10C7042400000000C744240400000000C744240800000000C744240C00000000C5FA6F3C244883C4104883EC10C7042400000000C744240400000000C744240800000000C744240C00000000C5FA6F14244883C41066450F3814F966410F3814F166450F3814E966410F3814E966450F3814D966410F3814D966450F3814D166450F3814E166450F3814C166450F3814F166410F3814C966410F3814E166410F3814F966410F3814D166450F3814F966410F3814F166450F3814E966410F3814E966450F3814D966410F3814D966450F3814D166450F3814E166450F3814C166450F3814F166410F3814C966410F3814E166410F3814F966410F3814D166450F3814F966410F3814F166450F3814E966410F3814E966450F3814D966410F3814D966450F3814D166450F3814E166450F3814C166450F3814F166410F3814C966410F3814E166410F3814F966410F3814D166450F3814F966410F3814F166450F3814E966410F3814E966450F3814D966410F3
814D966450F3814D166450F3814E166450F3814C166450F3814F166410F3814C966410F3814E166410F3814F966410F3814D1C3
  ...
  Check generated assembly with: /usr/bin/objdump -d /tmp/snippet-9024ea.o
  ---
  mode:            inverse_throughput
  key:
    instructions:
      - 'BLENDVPSrr0 XMM7 XMM7 XMM5'
      - 'BLENDVPSrr0 XMM15 XMM15 XMM5'
      - 'BLENDVPSrr0 XMM9 XMM9 XMM5'
      - 'BLENDVPSrr0 XMM14 XMM14 XMM5'
      - 'BLENDVPSrr0 XMM12 XMM12 XMM5'
      - 'BLENDVPSrr0 XMM3 XMM3 XMM5'
      - 'BLENDVPSrr0 XMM1 XMM1 XMM5'
      - 'BLENDVPSrr0 XMM6 XMM6 XMM5'
      - 'BLENDVPSrr0 XMM4 XMM4 XMM5'
      - 'BLENDVPSrr0 XMM10 XMM10 XMM5'
      - 'BLENDVPSrr0 XMM8 XMM8 XMM5'
      - 'BLENDVPSrr0 XMM13 XMM13 XMM5'
      - 'BLENDVPSrr0 XMM11 XMM11 XMM5'
      - 'BLENDVPSrr0 XMM2 XMM2 XMM5'
    config:          ''
    register_initial_values:
      - 'XMM7=0x0'
      - 'XMM5=0x0'
      - 'XMM0=0x0'
      - 'XMM15=0x0'
      - 'XMM9=0x0'
      - 'XMM14=0x0'
      - 'XMM12=0x0'
      - 'XMM3=0x0'
      - 'XMM1=0x0'
      - 'XMM6=0x0'
      - 'XMM4=0x0'
      - 'XMM10=0x0'
      - 'XMM8=0x0'
      - 'XMM13=0x0'
      - 'XMM11=0x0'
      - 'XMM2=0x0'
  cpu_name:        znver3
  llvm_triple:     x86_64-unknown-linux-gnu
  num_repetitions: 10000
  measurements:
    - { key: inverse_throughput, value: 0.5294, per_snippet_value: 7.4116 }
  error:           ''
  info:            instruction has tied variables, avoiding Read-After-Write issue, picking random def and use registers not aliasing each other, for uses, reusing the same register for all positions
  assembled_snippet: 4883EC10C7042400000000C744240400000000C744240800000000C744240C00000000C5FA6F3C244883C4104883EC10C7042400000000C744240400000000C744240800000000C744240C00000000C5FA6F2C244883C4104883EC10C7042400000000C744240400000000C744240800000000C744240C00000000C5FA6F04244883C4104883EC10C7042400000000C744240400000000C744240800000000C744240C00000000C57A6F3C244883C4104883EC10C7042400000000C744240400000000C744240800000000C744240C00000000C57A6F0C244883C4104883EC10C7042400000000C744240400000000C744240800000000C744240C00000000C57A6F34244883C4104883EC10C7042400000000C744240400000000C744240800000000C744240C00000000C57A6F24244883C4104883EC10C7042400000000C744240400000000C744240800000000C744240C00000000C5FA6F1C244883C4104883EC10C7042400000000C744240400000000C744240800000000C744240C00000000C5FA6F0C244883C4104883EC10C7042400000000C744240400000000C744240800000000C744240C00000000C5FA6F34244883C4104883EC10C7042400000000C744240400000000C744240800000000C744240C00000000C5FA6F24244883C4104883EC10C7042400000000C744240400000000C744240800000000C744240C00000000C57A6F14244883C4104883EC10C7042400000000C744240400000000C744240800000000C744240C00000000C57A6F04244883C4104883EC10C7042400000000C744240400000000C744240800000000C744240C00000000C57A6F2C244883C4104883EC10C7042400000000C744240400000000C744240800000000C744240C00000000C57A6F1C244883C4104883EC10C7042400000000C744240400000000C744240800000000C744240C00000000C5FA6F14244883C410660F3814FD66440F3814FD66440F3814CD66440F3814F566440F3814E5660F3814DD660F3814CD660F3814F5660F3814E566440F3814D566440F3814C566440F3814ED66440F3814DD660F3814D5660F3814FD66440F3814FD66440F3814CD66440F3814F566440F3814E5660F3814DD660F3814CD660F3814F5660F3814E566440F3814D566440F3814C566440F3814ED66440F3814DD660F3814D5660F3814FD66440F3814FD66440F3814CD66440F3814F566440F3814E5660F3814DD660F3814CD660F3814F5660F3814E566440F3814D566440F3814C566440F3814ED66440F3814DD660F3814D5660F3814FD66440F3814FD66440F3814CD66440F3814F566440F3814E5660F3814DD660F3814CD660F3814F5660F3814E566440
F3814D566440F3814C566440F3814ED66440F3814DD660F3814D5C3
  ...
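
For what it's worth, the operand choice in the first snippet above can be sanity-checked mechanically. This is only a minimal sketch: `check_snippet` is a hypothetical helper written for this comment, not something llvm-exegesis provides, and the instruction list is copied by hand from the first YAML document. It checks that the first two operands are tied, that no def register also shows up as an explicit use, and that XMM0 (the implicit mask operand) is never picked explicitly.

```python
# Hypothetical helper, not part of llvm-exegesis.
# Instructions copied by hand from the first snippet in the output above.
INSTRUCTIONS = [
    "BLENDVPSrr0 XMM1 XMM1 XMM5",
    "BLENDVPSrr0 XMM12 XMM12 XMM15",
    "BLENDVPSrr0 XMM14 XMM14 XMM9",
    "BLENDVPSrr0 XMM3 XMM3 XMM8",
    "BLENDVPSrr0 XMM7 XMM7 XMM4",
    "BLENDVPSrr0 XMM13 XMM13 XMM6",
    "BLENDVPSrr0 XMM10 XMM10 XMM8",
    "BLENDVPSrr0 XMM11 XMM11 XMM8",
    "BLENDVPSrr0 XMM2 XMM2 XMM8",
]


def check_snippet(instructions):
    """Return (defs, uses): registers written and registers explicitly read."""
    defs, uses = set(), set()
    for line in instructions:
        _opcode, dst, tied_src, src = line.split()
        # The first source operand is tied to the destination.
        assert dst == tied_src, f"operands not tied in: {line}"
        defs.add(dst)
        uses.add(src)
    return defs, uses


defs, uses = check_snippet(INSTRUCTIONS)

# No register that is written is also read as the explicit (non-tied) source.
no_def_use_overlap = defs.isdisjoint(uses)

# XMM0 is only an implicit operand; it should not appear explicitly.
xmm0_explicit = "XMM0" in (defs | uses)

print("defs:", sorted(defs))
print("uses:", sorted(uses))
print("defs and uses disjoint:", no_def_use_overlap)
print("XMM0 used explicitly:", xmm0_explicit)
```

XMM0 still appears in `register_initial_values` because it is read implicitly, but in this snippet nothing writes it.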

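As a side note on reading the measurements: in all three runs above, `per_snippet_value` matches `value` multiplied by the number of instructions in the snippet (9, 14 and 14). The sketch below only verifies that arithmetic on the numbers pasted above; that this is how the tool derives the field is my assumption from the output, and the reported values may be rounded.

```python
import math

# (value, instructions in snippet, reported per_snippet_value),
# copied by hand from the three measurements above.
RUNS = [
    (0.5294, 9, 4.7646),
    (0.5291, 14, 7.4074),
    (0.5294, 14, 7.4116),
]


def per_snippet(value, num_instructions):
    """Assumed relationship: per-instruction value scaled by snippet size."""
    return value * num_instructions


results = []
for value, count, reported in RUNS:
    computed = per_snippet(value, count)
    # Allow for the output being printed with four decimal places.
    matches = math.isclose(computed, reported, abs_tol=5e-5)
    results.append(matches)
    print(f"{value} * {count} = {computed:.4f} (reported {reported}): {matches}")
```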

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D139283/new/

https://reviews.llvm.org/D139283


