[llvm] [SLP]Initial compatibility support for shl v, 1 and add v, v (PR #181168)
Alexey Bataev via llvm-commits
llvm-commits at lists.llvm.org
Sat Mar 7 15:43:15 PST 2026
alexey-bataev wrote:
> Not sure if this case is out of the scope of this MR or not. Similar test case to above: `opt -passes=slp-vectorizer -mtriple=riscv64 -mattr=+m,+v -riscv-v-vector-bits-min=-1 -riscv-v-slp-max-vf=0 -S`
>
> ```
> define void @vec_add(ptr %dest, ptr %p) {
> entry:
> %inc0 = getelementptr inbounds i16, ptr %p, i64 1
> %inc1 = getelementptr inbounds i16, ptr %p, i64 2
> %inc2 = getelementptr inbounds i16, ptr %p, i64 3
> %e0 = load i16, ptr %p, align 4
> %e1 = load i16, ptr %inc0, align 2
> %e2 = load i16, ptr %inc1, align 2
> %e3 = load i16, ptr %inc2, align 2
>
> %a0 = add i16 %e0, %e0
> %a1 = shl i16 %e2, 1
> %a2 = shl i16 %e2, 1
> %a3 = shl i16 %e2, 1
>
> %inc4 = getelementptr inbounds i16, ptr %dest, i64 1
> %inc5 = getelementptr inbounds i16, ptr %dest, i64 2
> %inc6 = getelementptr inbounds i16, ptr %dest, i64 3
>
> store i16 %a0, ptr %dest, align 4
> store i16 %a1, ptr %inc4, align 2
> store i16 %a2, ptr %inc5, align 2
> store i16 %a3, ptr %inc6, align 2
> ret void
> }
> ```
>
> I noticed that we create/cost the same gather twice:
>
> ```
> 2.
> Scalars:
> %e0 = load i16, ptr %p, align 4
> %e2 = load i16, ptr %inc1, align 2
> State: NeedToGather
> MainOp: %e0 = load i16, ptr %p, align 4
> AltOp: %e0 = load i16, ptr %p, align 4
> VectorizedValue: NULL
> ReuseShuffleIndices: 0, 1, 1, 1,
> ReorderIndices:
> UserTreeIndex: {User:1 EdgeIdx:0}
> 3.
> Scalars:
> %e0 = load i16, ptr %p, align 4
> %e2 = load i16, ptr %inc1, align 2
> State: NeedToGather
> MainOp: %e0 = load i16, ptr %p, align 4
> AltOp: %e0 = load i16, ptr %p, align 4
> VectorizedValue: NULL
> ReuseShuffleIndices: 0, 1, 1, 1,
> ReorderIndices:
> UserTreeIndex: {User:1 EdgeIdx:1}
> ```
>
> which causes the vectorization to be unprofitable.
>
> ```
> SLP: Adding cost -3 for bundle Idx: 1, n=4 [ %a0 = add i16 %e0, %e0, ..].
> SLP: Current total cost = -6
> SLP: perfect diamond match for gather bundle Idx: 2, n=2 [ %e0 = load i16, ptr %p, align 4, ..].
> SLP: Adding cost 4 for bundle Idx: 2, n=2 [ %e0 = load i16, ptr %p, align 4, ..].
> SLP: Current total cost = -2
> SLP: perfect diamond match for gather bundle Idx: 3, n=2 [ %e0 = load i16, ptr %p, align 4, ..].
> SLP: Adding cost 4 for bundle Idx: 3, n=2 [ %e0 = load i16, ptr %p, align 4, ..].
> SLP: Current total cost = 2
> ```
Fixed in a separate patch; see the updates to test/Transforms/SLPVectorizer/RISCV/same-node-reused.ll.
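
For reference, the arithmetic in the quoted cost log can be reproduced with a small sketch. This is only an illustration of why costing the identical gather twice flips the result; `total_cost` and its `dedupe` flag are hypothetical names, not part of the SLP vectorizer, and charging a repeated gather once is an assumption about the intended accounting rather than a description of the actual fix. The numbers come from the log above, and a negative total means the tree is considered profitable.

```python
# Hypothetical illustration only; this is not the SLP vectorizer's code.
def total_cost(start, gathers, dedupe):
    """Add gather costs to `start`.

    With dedupe=True, a gather over the same scalars is charged only once
    (an assumed accounting, used here just to show the arithmetic).
    """
    seen = set()
    total = start
    for scalars, cost in gathers:
        if dedupe and scalars in seen:
            continue  # identical gather already charged
        seen.add(scalars)
        total += cost
    return total


# Values from the quoted debug log: the running total is -6 after bundle
# Idx 1, then two identical gather bundles (Idx 2 and Idx 3) cost 4 each.
gathers = [(("%e0", "%e2"), 4), (("%e0", "%e2"), 4)]

print(total_cost(-6, gathers, dedupe=False))  # 2: unprofitable, as in the log
print(total_cost(-6, gathers, dedupe=True))   # -2: would remain profitable
```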
https://github.com/llvm/llvm-project/pull/181168