[all-commits] [llvm/llvm-project] db2c6c: [NFC][X86][MCA] AMD Zen 3: improve MULX test coverage

Roman Lebedev via All-commits all-commits at lists.llvm.org
Fri Aug 27 03:27:48 PDT 2021


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: db2c6cd99c88018dff26fdb0d39ffa10ea40c4b9
      https://github.com/llvm/llvm-project/commit/db2c6cd99c88018dff26fdb0d39ffa10ea40c4b9
  Author: Roman Lebedev <lebedev.ri at gmail.com>
  Date:   2021-08-27 (Fri, 27 Aug 2021)

  Changed paths:
    A llvm/test/tools/llvm-mca/X86/Znver3/mulx-lo-reg-use.s
    A llvm/test/tools/llvm-mca/X86/Znver3/mulx-read-advance.s
    A llvm/test/tools/llvm-mca/X86/Znver3/mulx-same-regs.s

  Log Message:
  -----------
  [NFC][X86][MCA] AMD Zen 3: improve MULX test coverage

The modeled latency for MULX isn't right.


  Commit: 0f04936a2d4e3ec7db681547876f7669c445af0e
      https://github.com/llvm/llvm-project/commit/0f04936a2d4e3ec7db681547876f7669c445af0e
  Author: Roman Lebedev <lebedev.ri at gmail.com>
  Date:   2021-08-27 (Fri, 27 Aug 2021)

  Changed paths:
    M llvm/lib/Target/X86/X86ScheduleZnver3.td
    M llvm/test/tools/llvm-mca/X86/Znver3/mulx-lo-reg-use.s
    M llvm/test/tools/llvm-mca/X86/Znver3/mulx-read-advance.s
    M llvm/test/tools/llvm-mca/X86/cv_fpo_directive_no_segfault.s

  Log Message:
  -----------
  [X86] AMD Zen 3: MULX produces low part of the result in 3cy, +1cy for high part

As per llvm-exegesis measurements.
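
To illustrate what the split latency means for dependent code, here is a rough sketch. It is not taken from the commit or from X86ScheduleZnver3.td. It only uses the two numbers stated in the log message (low half after 3 cycles, high half 1 cycle later) and deliberately ignores ReadAdvance and every other detail of the scheduling model. The function name chain_cycles and its parameters are made up for this example.

```python
# Cycle counts taken from the commit message above; everything else
# in this block is a hypothetical, simplified model.
MULX_LO_LATENCY = 3                    # low half of the product
MULX_HI_LATENCY = MULX_LO_LATENCY + 1  # high half arrives 1 cycle later


def chain_cycles(n, consumed_half):
    """Cycles for a chain of n MULX instructions in which each one
    consumes the given half ("lo" or "hi") of the previous result."""
    per_link = {"lo": MULX_LO_LATENCY, "hi": MULX_HI_LATENCY}[consumed_half]
    return n * per_link


lo_chain = chain_cycles(10, "lo")
hi_chain = chain_cycles(10, "hi")
print(lo_chain, hi_chain)
```

Under this model, a chain that forwards only the low half completes one link every 3 cycles, while a chain that forwards the high half needs 4 cycles per link.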


  Commit: d4d459e7475b4bb0d15280f12ed669342fa5edcd
      https://github.com/llvm/llvm-project/commit/d4d459e7475b4bb0d15280f12ed669342fa5edcd
  Author: Roman Lebedev <lebedev.ri at gmail.com>
  Date:   2021-08-27 (Fri, 27 Aug 2021)

  Changed paths:
    M llvm/lib/Target/X86/X86ScheduleZnver3.td
    M llvm/test/tools/llvm-mca/X86/Znver3/mulx-hi-read-advance.s
    M llvm/test/tools/llvm-mca/X86/Znver3/mulx-read-advance.s
    M llvm/test/tools/llvm-mca/X86/Znver3/resources-bmi2.s

  Log Message:
  -----------
  [X86] AMD Zen 3: MULX w/ mem operand has the same throughput as with reg op

llvm-exegesis is faulty: when measuring inverse throughput, it sometimes
produces snippets that have loop-carried dependencies,
which must be what caused me to measure it incorrectly originally.

After looking much more carefully, the inverse throughput should match
that of the MULX w/ reg op.

As per llvm-exegesis measurements.
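
The pitfall described in this log message can be sketched roughly as follows. This is a simplified steady-state model, not a description of how llvm-exegesis works internally. The inverse throughput of 1 cycle is a placeholder assumption, not a measurement from this commit; the 3-cycle latency is the low-half figure from the previous commit in this series. The function name cycles_per_iteration is hypothetical.

```python
def cycles_per_iteration(latency, inverse_throughput, loop_carried):
    """Steady-state cost of one snippet iteration in a toy model.

    Without a loop-carried dependency, iterations overlap and the cost
    is bounded by inverse throughput. With one, each iteration must
    wait for the previous result, so latency becomes the bound.
    """
    if loop_carried:
        return max(latency, inverse_throughput)
    return inverse_throughput


ASSUMED_INV_THROUGHPUT = 1.0  # placeholder value, NOT a measured number
MULX_LO_LATENCY = 3           # from the preceding commit's log message

clean = cycles_per_iteration(MULX_LO_LATENCY, ASSUMED_INV_THROUGHPUT, False)
tainted = cycles_per_iteration(MULX_LO_LATENCY, ASSUMED_INV_THROUGHPUT, True)
print(clean, tainted)
```

In this model, a snippet with an accidental loop-carried dependency reports the latency instead of the inverse throughput, which is the kind of overestimate the message describes.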


Compare: https://github.com/llvm/llvm-project/compare/692ebe539537...d4d459e7475b

