[llvm] [SLP]Initial compatibility support for shl v, 1 and add v, v (PR #181168)

Sat Mar 7 15:43:15 PST 2026

alexey-bataev wrote:

> Not sure if this case is out of the scope of this MR or not. Similar test case to above: `opt -passes=slp-vectorizer -mtriple=riscv64 -mattr=+m,+v -riscv-v-vector-bits-min=-1 -riscv-v-slp-max-vf=0 -S`
> 
> ```
> define void @vec_add(ptr %dest, ptr %p) {
> entry:
>   %inc0 = getelementptr inbounds i16, ptr %p, i64 1
>   %inc1 = getelementptr inbounds i16, ptr %p, i64 2
>   %inc2 = getelementptr inbounds i16, ptr %p, i64 3
>   %e0 = load i16, ptr %p, align 4
>   %e1 = load i16, ptr %inc0, align 2
>   %e2 = load i16, ptr %inc1, align 2
>   %e3 = load i16, ptr %inc2, align 2
> 
>   %a0 = add i16 %e0, %e0
>   %a1 = shl i16 %e2, 1
>   %a2 = shl i16 %e2, 1
>   %a3 = shl i16 %e2, 1
> 
>   %inc4 = getelementptr inbounds i16, ptr %dest, i64 1
>   %inc5 = getelementptr inbounds i16, ptr %dest, i64 2
>   %inc6 = getelementptr inbounds i16, ptr %dest, i64 3
> 
>   store i16 %a0, ptr %dest, align 4
>   store i16 %a1, ptr %inc4, align 2
>   store i16 %a2, ptr %inc5, align 2
>   store i16 %a3, ptr %inc6, align 2
>   ret void
> }
> ```
> 
> I noticed that we create/cost the same gather twice:
> 
> ```
> 2.
> Scalars:
>     %e0 = load i16, ptr %p, align 4
>     %e2 = load i16, ptr %inc1, align 2
> State: NeedToGather
> MainOp:   %e0 = load i16, ptr %p, align 4
> AltOp:   %e0 = load i16, ptr %p, align 4
> VectorizedValue: NULL
> ReuseShuffleIndices: 0, 1, 1, 1,
> ReorderIndices:
> UserTreeIndex: {User:1 EdgeIdx:0}
> 3.
> Scalars:
>     %e0 = load i16, ptr %p, align 4
>     %e2 = load i16, ptr %inc1, align 2
> State: NeedToGather
> MainOp:   %e0 = load i16, ptr %p, align 4
> AltOp:   %e0 = load i16, ptr %p, align 4
> VectorizedValue: NULL
> ReuseShuffleIndices: 0, 1, 1, 1,
> ReorderIndices:
> UserTreeIndex: {User:1 EdgeIdx:1}
> ```
> 
> which causes the vectorization to be unprofitable.
> 
> ```
> SLP: Adding cost -3 for bundle Idx: 1, n=4 [  %a0 = add i16 %e0, %e0, ..].
> SLP: Current total cost = -6
> SLP: perfect diamond match for gather bundle Idx: 2, n=2 [  %e0 = load i16, ptr %p, align 4, ..].
> SLP: Adding cost 4 for bundle Idx: 2, n=2 [  %e0 = load i16, ptr %p, align 4, ..].
> SLP: Current total cost = -2
> SLP: perfect diamond match for gather bundle Idx: 3, n=2 [  %e0 = load i16, ptr %p, align 4, ..].
> SLP: Adding cost 4 for bundle Idx: 3, n=2 [  %e0 = load i16, ptr %p, align 4, ..].
> SLP: Current total cost = 2
> ```

Fixed in a separate patch; see the updates to test/Transforms/SLPVectorizer/RISCV/same-node-reused.ll.
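
For reference, the arithmetic in the quoted log can be replayed with a small sketch. This is not the SLP vectorizer's cost model or the code from the separate patch; the helper `total_cost` and the tuple layout are hypothetical. It assumes that a gather node with the same scalars as an earlier one is charged nothing the second time, and that a negative total means the tree is profitable. The numbers (-6 after bundle Idx 1, and 4 for each of gather bundles Idx 2 and Idx 3) come from the log above.

```python
# Illustrative only: NOT the SLP vectorizer's cost model.
# All numbers are copied from the quoted debug log.

def total_cost(base, gathers, dedupe):
    """Add gather costs to a running total.

    base    -- running total before the gather nodes are costed
    gathers -- list of (node_idx, scalars, cost) tuples (hypothetical layout)
    dedupe  -- if True, charge a repeated set of scalars only once
               (assumption: the reused gather is free)
    """
    seen = set()
    total = base
    for _idx, scalars, cost in gathers:
        if dedupe and scalars in seen:
            continue  # same gather already costed, skip it
        seen.add(scalars)
        total += cost
    return total

# Running total after bundle Idx 1, as printed in the log.
base_after_idx1 = -6

# Gather bundles Idx 2 and Idx 3: same scalars, cost 4 each.
gathers = [
    (2, ("%e0", "%e2"), 4),
    (3, ("%e0", "%e2"), 4),
]

as_logged = total_cost(base_after_idx1, gathers, dedupe=False)
deduped = total_cost(base_after_idx1, gathers, dedupe=True)

print("cost with both gathers charged:", as_logged)
print("cost with the duplicate charged once:", deduped)
```

Under these assumptions the duplicated gather alone accounts for the difference between a positive total and a negative one.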

https://github.com/llvm/llvm-project/pull/181168