[llvm] [RISCV] Use ri.vunzip2{a,b} for e64 fixed length deinterleave(2) shuffles (PR #137217)
Philip Reames via llvm-commits
llvm-commits at lists.llvm.org
Tue Jun 3 09:56:21 PDT 2025
preames wrote:
> Is there a reason why this doesn't handle < e64? Is it that hardware is expected to have similar performance with a single vnsrl.wx vs ri.vunzip and we want to prefer the standard extension?
Yes, though it also introduces less test churn and divergence. I'm not strongly opinionated on this, and could easily see this flipping in the other direction at some point if results merit it. The vnsrl allows partial overlap (which ri.vunzip doesn't), but requires the source to be an aligned register group instead of two independent registers. So it's a bit unclear what's net profitable.
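
For readers less familiar with the shuffle being discussed, here is a rough Python model of a deinterleave(2) and of the narrowing-shift approach that vnsrl enables for element widths below e64. It is only an illustrative sketch, not LLVM code: the function names are hypothetical, the vectors are plain Python lists, and it assumes the lower-indexed element of each pair sits in the low bits of the widened element.

```python
def deinterleave2(v):
    """Reference semantics: split v into its even- and odd-indexed elements."""
    return v[0::2], v[1::2]


def deinterleave2_via_narrowing_shift(v, sew):
    """Model of the vnsrl-style lowering (hypothetical helper, not an LLVM API).

    Each adjacent pair of SEW-bit elements is viewed as one 2*SEW-bit element.
    A narrowing right shift by 0 keeps the even elements; a narrowing right
    shift by SEW keeps the odd elements.
    """
    mask = (1 << sew) - 1
    # Reinterpret pairs as wide elements, lower index in the low bits (assumed).
    wide = [(v[i] & mask) | ((v[i + 1] & mask) << sew)
            for i in range(0, len(v), 2)]
    evens = [w & mask for w in wide]           # shift by 0, then truncate
    odds = [(w >> sew) & mask for w in wide]   # shift by SEW, then truncate
    return evens, odds


if __name__ == "__main__":
    data = list(range(8))
    print(deinterleave2(data))
    print(deinterleave2_via_narrowing_shift(data, 32))
```

Note that this trick needs a source element twice as wide as the result, so with a maximum element width of 64 bits it has no counterpart for e64 results, which is the case this patch covers with ri.vunzip2{a,b}.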
https://github.com/llvm/llvm-project/pull/137217