[llvm] [RISCV] Allow folding vmerge with implicit merge operand when true has tied dest (PR #78565)

Philip Reames via llvm-commits llvm-commits at lists.llvm.org
Fri Jan 19 11:20:46 PST 2024


preames wrote:

Haven't read the patch, but I'm not convinced of the correctness of the justification.

Consider the following case:
VL = 2, SEW = 32, VLEN = 128, TU
V1 = VADD PT, A, B
// result is [A[0] + B[0], A[1] + B[1], PT[2], PT[3]]
VL = 4
V2 = VMERGE undef, V1, C, <true, true, false, true>
// result is [A[0] + B[0], A[1] + B[1], C[2], PT[3]]

The transform would create:
VL = 2, SEW = 32, VLEN = 128, TU/MU
V1 = VADD C, A, B, <true, true, false, true>
// result is [A[0] + B[0], A[1] + B[1], C[2], C[3]]

Note the difference in the last element: PT[3] in the original sequence versus C[3] after the transform.
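
For anyone who wants to check the element-by-element reasoning, here is a toy Python model of the two sequences. It is only a sketch under the assumptions of the example above (tail/mask undisturbed, VMERGE picking V1 where the mask is true); the helper names `vadd_tu` and `vmerge` are made up for illustration and are not LLVM or RVV APIs.

```python
# Toy element-wise model of the example; elements are symbolic strings.
# vadd_tu and vmerge are hypothetical helpers, not real LLVM/RVV functions.
NUM_ELTS = 4  # VLEN=128 / SEW=32

A = [f"A[{i}]" for i in range(NUM_ELTS)]
B = [f"B[{i}]" for i in range(NUM_ELTS)]
C = [f"C[{i}]" for i in range(NUM_ELTS)]
PT = [f"PT[{i}]" for i in range(NUM_ELTS)]
MASK = [True, True, False, True]


def vadd_tu(passthru, a, b, vl, mask=None):
    """Add with tail-undisturbed (and, if masked, mask-undisturbed) policy."""
    out = list(passthru)  # tail and masked-off lanes keep the passthru value
    for i in range(vl):
        if mask is None or mask[i]:
            out[i] = f"{a[i]} + {b[i]}"
    return out


def vmerge(true_vals, false_vals, mask, vl):
    """Pick true_vals[i] where mask[i] is set, else false_vals[i]."""
    return [true_vals[i] if mask[i] else false_vals[i] for i in range(vl)]


# Original sequence: unmasked VADD at VL=2, then VMERGE at VL=4.
v1 = vadd_tu(PT, A, B, vl=2)
original = vmerge(v1, C, MASK, vl=4)

# Folded form: masked VADD at VL=2 with C as the passthru.
folded = vadd_tu(C, A, B, vl=2, mask=MASK)

print(original)
print(folded)
```

In this model the two results agree on elements 0 through 2 and differ only in element 3, which matches the hand-worked results above.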

Not sure I got my example entirely correct, but the basic idea I'm aiming for is that if the original sequence produces values which consume 4 distinct input registers (A, B, C, and PT here), there's no folded form for the VADD which can do the same.  Unless I'm missing something?

https://github.com/llvm/llvm-project/pull/78565

