[all-commits] [llvm/llvm-project] b5657d: [RISCV] Reverse default assumption about performan...
Philip Reames via All-commits
all-commits at lists.llvm.org
Wed Jul 10 07:36:17 PDT 2024
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: b5657d6dc7066156e33bc83e297e534d41731560
https://github.com/llvm/llvm-project/commit/b5657d6dc7066156e33bc83e297e534d41731560
Author: Philip Reames <preames at rivosinc.com>
Date: 2024-07-10 (Wed, 10 Jul 2024)
Changed paths:
M llvm/lib/Target/RISCV/RISCVFeatures.td
M llvm/lib/Target/RISCV/RISCVProcessors.td
M llvm/test/CodeGen/RISCV/rvv/fixed-vectors-fp-buildvec.ll
M llvm/test/CodeGen/RISCV/rvv/fixed-vectors-fp-shuffles.ll
M llvm/test/CodeGen/RISCV/rvv/fixed-vectors-fp-vrgather.ll
M llvm/test/CodeGen/RISCV/rvv/fixed-vectors-int-buildvec.ll
M llvm/test/CodeGen/RISCV/rvv/fixed-vectors-int-vrgather.ll
M llvm/test/CodeGen/RISCV/rvv/fixed-vectors-int.ll
M llvm/test/CodeGen/RISCV/rvv/fixed-vectors-mask-buildvec.ll
M llvm/test/CodeGen/RISCV/rvv/fixed-vectors-masked-gather.ll
M llvm/test/CodeGen/RISCV/rvv/fixed-vectors-masked-scatter.ll
M llvm/test/CodeGen/RISCV/rvv/fixed-vectors-strided-load-store-asm.ll
M llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vwadd.ll
M llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vwaddu.ll
M llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vwmulsu.ll
M llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vwsub.ll
M llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vwsubu.ll
M llvm/test/CodeGen/RISCV/rvv/vfma-vp-combine.ll
M llvm/test/CodeGen/RISCV/rvv/vreductions-fp-sdnode.ll
M llvm/test/CodeGen/RISCV/rvv/vsetvli-insert-crossbb.ll
M llvm/test/CodeGen/RISCV/rvv/vsplats-fp.ll
Log Message:
-----------
[RISCV] Reverse default assumption about performance of vlseN.v vd, (rs1), x0 (#98205)
Some cores implement an optimization for a strided load with an x0
stride, performing fewer memory operations than implied by VL since all
of the addresses are the same. This appears to hold for only a minority
of available implementations: we know that sifive-x280 does, but
sifive-p670 and spacemit-x60 both do not.
(To be more precise, measurements on the x60 appear to indicate that a
stride of x0 has similar latency to a non-zero stride, and that both
are about twice the latency of a vleN.v. I'm taking this to mean the x0
case is not optimized.)
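
To illustrate the lowering this tuning decision affects, here is a hand-written
sketch (not taken from the patch's test diffs) of the two ways a splat of a
loaded scalar can be materialized; the register choices and VL/SEW settings are
illustrative assumptions only:

  # Zero-stride form: assumed profitable only on cores that opt in
  # (e.g. sifive-x280). A single strided load with stride x0 fills
  # every element from the same address.
  vsetivli zero, 4, e64, m1, ta, ma
  vlse64.v v8, (a0), zero

  # New default: a plain scalar load followed by a vector splat,
  # which does not rely on the zero-stride load being optimized.
  ld       a1, 0(a0)
  vsetivli zero, 4, e64, m1, ta, ma
  vmv.v.x  v8, a1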
We had an existing flag by which a processor could opt out of this
assumption, but it had no upstream users. Rather than adding that flag to
the p670 and x60, this patch reverses the default and adds an opt-in flag
only to the x280.
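
For reference, a minimal sketch of what an opt-in tuning feature of this kind
looks like in RISCVFeatures.td; the name and description below are written from
memory of the upstream convention and may not match the patch exactly:

  // Hypothetical opt-in tuning feature (name and wording are assumptions).
  // A processor such as sifive-x280 that does optimize the x0-stride case
  // would list this among its tune features in RISCVProcessors.td.
  def TuneOptimizedZeroStrideLoad
      : SubtargetFeature<"optimized-zero-stride-load",
                         "HasOptimizedZeroStrideLoad", "true",
                         "Optimized (perform fewer memory operations) zero-stride vector load">;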