[llvm] [RISCV] Fold vector shift of sext/zext to widening multiply (PR #121563)
Craig Topper via llvm-commits
llvm-commits at lists.llvm.org
Tue Jan 28 11:06:44 PST 2025
topperc wrote:
> > I've heard that this transform won't be profitable on other CPUs, so I added a commit that enables it on BPI's SpacemiT X60 only. Based on https://camel-cdr.github.io/rvv-bench-results/canmv_k230/ perhaps K230 too, but I don't have one to confirm.
> > The sole presence of Zvbb doesn't preclude this transform, because `vwsll.vi` only does zero-extension, while widening multiply comes in the sign-extending variant as well.
>
> It's probably profitable on SiFive x280. The vzext/vsext can produce DLEN bits per cycle where DLEN=VLEN/2. The latency until the first DLEN is ready is 4. The shift also produces DLEN bits per cycle. The latency until the first DLEN is ready is 8. The widening multiply produces DLEN*2 bits per cycle. The first 2 DLENs complete in 8 cycles.
I might have the shift latency wrong on x280. The scheduler model says 4, but I have other internal docs that say 8. I'm confirming.
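
For readers following the thread, here is a rough scalar sketch of the identity the fold relies on: extending a SEW-bit element to 2*SEW bits and shifting left by `k` gives the same 2*SEW-bit result as a widening multiply by `2**k`. This is only a model in plain Python, not the PR's code. The helper names (`sext`, `shl_of_ext`, `widening_mul_pow2`, `check_all`) are hypothetical, and treating the constant multiplier as an unsigned SEW-bit operand (so `k < SEW`) is an assumption of this sketch, not something stated in the PR.

```python
def sext(x, bits):
    """Interpret the low `bits` bits of x as a two's-complement signed value."""
    x &= (1 << bits) - 1
    return x - (1 << bits) if x & (1 << (bits - 1)) else x


def shl_of_ext(x, k, sew, signed):
    """Original pattern: extend a SEW-bit element to 2*SEW bits, then shift left by k."""
    wide_mask = (1 << (2 * sew)) - 1
    ext = sext(x, sew) if signed else x & ((1 << sew) - 1)
    return (ext << k) & wide_mask


def widening_mul_pow2(x, k, sew, signed):
    """Folded pattern: widening multiply of a SEW-bit element by the constant 2**k.

    Assumption of this sketch: the multiplier is an unsigned SEW-bit operand,
    so 2**k must fit in SEW bits, i.e. k < sew.
    """
    if not 0 <= k < sew:
        raise ValueError("2**k does not fit in an unsigned SEW-bit operand")
    wide_mask = (1 << (2 * sew)) - 1
    a = sext(x, sew) if signed else x & ((1 << sew) - 1)
    b = 1 << k
    return (a * b) & wide_mask


def check_all(sew=8):
    """Exhaustively compare both patterns for every element value and shift amount."""
    for signed in (False, True):
        for k in range(sew):
            for x in range(1 << sew):
                if shl_of_ext(x, k, sew, signed) != widening_mul_pow2(x, k, sew, signed):
                    return False
    return True
```

Calling `check_all(8)` compares the two forms for every 8-bit element and every shift amount from 0 to 7, in both the zero-extending and sign-extending cases. The sign-extending case is the one that `vwsll.vi` does not cover, per the quoted comment above.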
https://github.com/llvm/llvm-project/pull/121563