[llvm] [RISCV] Use ri.vunzip2{a,b} for e64 fixed length deinterleave(2) shuffles (PR #137217)

Tue Jun 3 09:56:21 PDT 2025

preames wrote:

> Is there a reason why this doesn't handle < e64? Is it that hardware is expected to have similar performance with a single vnsrl.wx vs ri.vunzip, and we want to prefer the standard extension?

Yes, though it also introduces less test churn and divergence. I'm not strongly opinionated on this, and could easily see this flipping the other direction at some point if results merit it. The vnsrl allows partial overlap (which ri.vunzip doesn't), but requires the source to be an aligned register group instead of two independent registers. So it's a bit unclear what's net profitable.
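
For readers following along, here is a rough Python sketch of why a single narrowing shift can do a deinterleave(2) for element widths below e64. It is only a model of the idea, not the actual lowering: the names `deinterleave2` and `vnsrl_model` are hypothetical, and the little-endian pairing of adjacent elements is an assumption of the model. Each pair of adjacent SEW-bit elements is viewed as one 2*SEW-bit element; shifting right by 0 keeps the even element and shifting by SEW keeps the odd one.

```python
def deinterleave2(src, offset):
    """Reference semantics: every second element, starting at offset
    (0 selects the even elements, 1 selects the odd elements)."""
    return src[offset::2]


def vnsrl_model(src, sew, offset):
    """Hypothetical model of a narrowing right shift used as a deinterleave.

    Adjacent pairs of SEW-bit elements are viewed as one 2*SEW-bit element
    (low element in the low bits), shifted right by offset*SEW bits, and
    truncated back to SEW bits.
    """
    mask = (1 << sew) - 1
    out = []
    for lo, hi in zip(src[0::2], src[1::2]):
        wide = ((hi & mask) << sew) | (lo & mask)  # the 2*SEW-bit view
        out.append((wide >> (offset * sew)) & mask)
    return out


src = [10, 11, 20, 21, 30, 31, 40, 41]
even = vnsrl_model(src, sew=32, offset=0)
odd = vnsrl_model(src, sew=32, offset=1)
print(even, odd)
```

At SEW=64 the wide view in this model would need 128-bit elements, which is why the narrowing-shift trick is not an option for e64 and a dedicated unzip instruction is attractive there.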

https://github.com/llvm/llvm-project/pull/137217