[all-commits] [llvm/llvm-project] b5657d: [RISCV] Reverse default assumption about performan...
Philip Reames via All-commits
all-commits at lists.llvm.org
Wed Jul 10 07:36:17 PDT 2024
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: b5657d6dc7066156e33bc83e297e534d41731560
https://github.com/llvm/llvm-project/commit/b5657d6dc7066156e33bc83e297e534d41731560
Author: Philip Reames <preames at rivosinc.com>
Date: 2024-07-10 (Wed, 10 Jul 2024)
Changed paths:
M llvm/lib/Target/RISCV/RISCVFeatures.td
M llvm/lib/Target/RISCV/RISCVProcessors.td
M llvm/test/CodeGen/RISCV/rvv/fixed-vectors-fp-buildvec.ll
M llvm/test/CodeGen/RISCV/rvv/fixed-vectors-fp-shuffles.ll
M llvm/test/CodeGen/RISCV/rvv/fixed-vectors-fp-vrgather.ll
M llvm/test/CodeGen/RISCV/rvv/fixed-vectors-int-buildvec.ll
M llvm/test/CodeGen/RISCV/rvv/fixed-vectors-int-vrgather.ll
M llvm/test/CodeGen/RISCV/rvv/fixed-vectors-int.ll
M llvm/test/CodeGen/RISCV/rvv/fixed-vectors-mask-buildvec.ll
M llvm/test/CodeGen/RISCV/rvv/fixed-vectors-masked-gather.ll
M llvm/test/CodeGen/RISCV/rvv/fixed-vectors-masked-scatter.ll
M llvm/test/CodeGen/RISCV/rvv/fixed-vectors-strided-load-store-asm.ll
M llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vwadd.ll
M llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vwaddu.ll
M llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vwmulsu.ll
M llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vwsub.ll
M llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vwsubu.ll
M llvm/test/CodeGen/RISCV/rvv/vfma-vp-combine.ll
M llvm/test/CodeGen/RISCV/rvv/vreductions-fp-sdnode.ll
M llvm/test/CodeGen/RISCV/rvv/vsetvli-insert-crossbb.ll
M llvm/test/CodeGen/RISCV/rvv/vsplats-fp.ll
Log Message:
-----------
[RISCV] Reverse default assumption about performance of vlseN.v vd, (rs1), x0 (#98205)
Some cores implement an optimization for a strided load with an x0
stride, performing fewer memory operations than implied by VL since all
of the addresses are the same. This appears to hold for only a minority
of available implementations: we know that sifive-x280 does, but
sifive-p670 and spacemit-x60 both do not.
(To be more precise, measurements on the x60 appear to indicate that a
stride of x0 has similar latency to a non-zero stride, and that both
are about twice the latency of a vleN.v. I'm taking this to mean the x0
case is not optimized.)
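
To illustrate the lowering this tuning decision affects, here is a hand-written
sketch (not taken from the patch's test diffs) of the two ways a splat of a
loaded scalar can be materialized; the register choices and VL/SEW settings are
illustrative assumptions only:

  # Zero-stride form: assumed profitable only on cores that opt in
  # (e.g. sifive-x280). A single strided load with stride x0 fills
  # every element from the same address.
  vsetivli zero, 4, e64, m1, ta, ma
  vlse64.v v8, (a0), zero

  # New default: a plain scalar load followed by a vector splat,
  # which does not rely on the zero-stride load being optimized.
  ld       a1, 0(a0)
  vsetivli zero, 4, e64, m1, ta, ma
  vmv.v.x  v8, a1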
We had an existing flag by which a processor could opt out of this
assumption, but it had no upstream users. Rather than adding that flag to
the p670 and x60, this patch reverses the default and adds an opt-in flag
only to the x280.
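
For reference, a minimal sketch of what an opt-in tuning feature of this kind
looks like in RISCVFeatures.td; the name and description below are written from
memory of the upstream convention and may not match the patch exactly:

  // Hypothetical opt-in tuning feature (name and wording are assumptions).
  // A processor such as sifive-x280 that does optimize the x0-stride case
  // would list this among its tune features in RISCVProcessors.td.
  def TuneOptimizedZeroStrideLoad
      : SubtargetFeature<"optimized-zero-stride-load",
                         "HasOptimizedZeroStrideLoad", "true",
                         "Optimized (perform fewer memory operations) zero-stride vector load">;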