[llvm] [RISCV] Allow folding vmerge with implicit merge operand when true has tied dest (PR #78565)
Luke Lau via llvm-commits
llvm-commits at lists.llvm.org
Mon Jan 22 07:55:19 PST 2024
lukel97 wrote:
We have an additional constraint that C has to be the same as PT though. In order for folding to occur, all of these need to be “equivalent”:

- VADD’s passthru
- VMERGE’s passthru
- VMERGE’s false

The code structure doesn’t make this particularly clear, and by “equivalent” I mean they all need to be undef or equal to VMERGE’s false IIUC. So we always transform from 3 distinct input registers to 3 input registers: A, B, and VMERGE’s false. So in the example below, C = PT so the two results would be the same.

On 20 Jan 2024, at 02:21, Philip Reames ***@***.***> wrote:
Haven't read the patch, but I'm not convinced of the correctness of the justification.
Consider the following case:
VL = 2, SEW = 32, VLEN=128, TU
V1 = VADD PT, A, B
// result is [A[0] + B[0], A[1] + B[1], PT[2], PT[3]]
VL=4
V2 = VMERGE undef, V1, C, <true, true, false, true>
// result is [A[0] + B[0], A[1] + B[1], C[2], PT[3]]
The transform would create:
VL = 2, SEW = 32, VLEN=128, TU/MU
V1 = VADD C, A, B, <true, true, false, true>
// result is [A[0] + B[0], A[1] + B[1], C[2], C[3]]
Note the difference in the last element.
Not sure I got my example entirely correct, but the basic idea I'm aiming for is that if the original sequence produces values which consume 4 distinct input registers, there's no folded form for the VADD which can do the same. Unless I'm missing something?
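For anyone who wants to check the two positions in this thread by hand, here is a small Python sketch of the example. It is only a toy model, not LLVM code: `vadd`, `vmerge`, `original` and `folded` are hypothetical helpers, vectors are plain 4-element lists (SEW = 32, VLEN = 128), undef lanes are modelled as `None`, and the VMERGE operand order (passthru, true, false, mask) follows the quoted example rather than any real pseudo-instruction definition. Tail and masked-off lanes are assumed to keep the passthru value (tail undisturbed, mask undisturbed).

```python
def vadd(passthru, a, b, vl, mask=None):
    """Add a and b over the first `vl` lanes.

    Tail lanes (index >= vl) and masked-off lanes keep the passthru value,
    modelling a tail-undisturbed / mask-undisturbed policy.
    """
    out = list(passthru)
    for i in range(vl):
        if mask is None or mask[i]:
            out[i] = a[i] + b[i]
    return out


def vmerge(passthru, true_vals, false_vals, mask, vl):
    """Pick true_vals[i] where mask[i] is set, otherwise false_vals[i]."""
    out = list(passthru)
    for i in range(vl):
        out[i] = true_vals[i] if mask[i] else false_vals[i]
    return out


def original(pt, a, b, c, mask):
    # VL = 2: V1 = VADD PT, A, B
    v1 = vadd(pt, a, b, vl=2)
    # VL = 4: V2 = VMERGE undef, V1, C, mask
    undef = [None] * 4
    return vmerge(undef, v1, c, mask, vl=4)


def folded(a, b, c, mask):
    # VL = 2: V1 = VADD C, A, B, mask
    return vadd(c, a, b, vl=2, mask=mask)


# Arbitrary illustrative values, chosen so every lane is distinguishable.
A = [1, 2, 3, 4]
B = [10, 20, 30, 40]
C = [100, 200, 300, 400]
PT = [-1, -2, -3, -4]
MASK = [True, True, False, True]

# Four distinct inputs (PT != C): the two sequences differ in the last lane.
distinct = (original(PT, A, B, C, MASK), folded(A, B, C, MASK))

# With the constraint from the reply (VADD's passthru == VMERGE's false):
same = (original(C, A, B, C, MASK), folded(A, B, C, MASK))

print(distinct)
print(same)
```

Under this model the two sequences disagree only in lane 3 when PT and C differ, and agree on every lane once PT is required to equal C, which is the situation the constraint in the reply is meant to enforce.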
https://github.com/llvm/llvm-project/pull/78565