[llvm] [RISCV] Improve lowering of spread(2) shuffles (PR #118658)

Philip Reames via llvm-commits llvm-commits at lists.llvm.org
Wed Dec 4 07:52:56 PST 2024


================
@@ -242,33 +242,27 @@ define <64 x float> @interleave_v32f32(<32 x float> %x, <32 x float> %y) {
 ; V128-NEXT:    slli a0, a0, 3
 ; V128-NEXT:    sub sp, sp, a0
 ; V128-NEXT:    .cfi_escape 0x0f, 0x0d, 0x72, 0x00, 0x11, 0x10, 0x22, 0x11, 0x08, 0x92, 0xa2, 0x38, 0x00, 0x1e, 0x22 # sp + 16 + 8 * vlenb
-; V128-NEXT:    vmv8r.v v24, v16
-; V128-NEXT:    vmv8r.v v16, v8
-; V128-NEXT:    vmv8r.v v8, v24
 ; V128-NEXT:    addi a0, sp, 16
-; V128-NEXT:    vs8r.v v24, (a0) # Unknown-size Folded Spill
+; V128-NEXT:    vs8r.v v8, (a0) # Unknown-size Folded Spill
 ; V128-NEXT:    vsetivli zero, 16, e32, m8, ta, ma
-; V128-NEXT:    vslidedown.vi v0, v24, 16
-; V128-NEXT:    li a0, -1
-; V128-NEXT:    vsetivli zero, 16, e32, m4, ta, ma
-; V128-NEXT:    vwaddu.vv v24, v8, v0
-; V128-NEXT:    vwmaccu.vx v24, a0, v0
-; V128-NEXT:    vsetivli zero, 16, e32, m8, ta, ma
-; V128-NEXT:    vslidedown.vi v0, v16, 16
+; V128-NEXT:    vslidedown.vi v24, v16, 16
+; V128-NEXT:    li a0, 32
+; V128-NEXT:    vslidedown.vi v0, v8, 16
 ; V128-NEXT:    lui a1, 699051
-; V128-NEXT:    li a2, 32
-; V128-NEXT:    vsetivli zero, 16, e32, m4, ta, ma
-; V128-NEXT:    vwaddu.vv v8, v0, v16
+; V128-NEXT:    vsetivli zero, 16, e64, m8, ta, ma
+; V128-NEXT:    vzext.vf2 v8, v24
+; V128-NEXT:    vzext.vf2 v24, v0
 ; V128-NEXT:    addi a1, a1, -1366
 ; V128-NEXT:    vmv.s.x v0, a1
-; V128-NEXT:    vwmaccu.vx v8, a0, v16
-; V128-NEXT:    vsetvli zero, a2, e32, m8, ta, ma
-; V128-NEXT:    vmerge.vvm v24, v8, v24, v0
-; V128-NEXT:    addi a1, sp, 16
-; V128-NEXT:    vl8r.v v8, (a1) # Unknown-size Folded Reload
+; V128-NEXT:    vsll.vx v8, v8, a0
+; V128-NEXT:    vsetvli zero, a0, e32, m8, ta, ma
+; V128-NEXT:    vmerge.vvm v24, v24, v8, v0
----------------
preames wrote:

The odd-looking pair of spread(2) shuffles followed by a merge, which could be a single interleave, comes from lowering this:
`t24: v32i32 = vector_shuffle<16,48,17,49,18,50,19,51,20,52,21,53,22,54,23,55,24,56,25,57,26,58,27,59,28,60,29,61,30,62,31,63> t4, t7`
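
To make the shape of that mask easier to see, here is a small Python sketch (not LLVM code) that splits it into even and odd lanes and checks that each is a consecutive run. The helper name `interleave_starts` is made up for illustration; only the mask itself is taken from the node above. Indices of 32 and above select from the second operand (t7).

```python
# Mask copied from the vector_shuffle node quoted above.
MASK = [16, 48, 17, 49, 18, 50, 19, 51, 20, 52, 21, 53, 22, 54, 23, 55,
        24, 56, 25, 57, 26, 58, 27, 59, 28, 60, 29, 61, 30, 62, 31, 63]


def interleave_starts(mask):
    """Hypothetical helper: return (even_start, odd_start) if the even lanes
    and the odd lanes of `mask` are each a consecutive run, else None."""
    even, odd = mask[0::2], mask[1::2]
    if not even or len(even) != len(odd):
        return None

    def is_run(xs):
        # True when every element is exactly one more than its predecessor.
        return all(b == a + 1 for a, b in zip(xs, xs[1:]))

    if not (is_run(even) and is_run(odd)):
        return None
    return even[0], odd[0]


NUM_ELTS = len(MASK)  # 32 lanes per operand for v32i32
even_start, odd_start = interleave_starts(MASK)
# Even lanes start at element 16 of t4; odd lanes start at index 48,
# which is element 48 - 32 = 16 of t7.
print(even_start, odd_start - NUM_ELTS)  # prints "16 16"
```

So the node is an interleave of the high half of t4 with the high half of t7.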

This fails our current restrictions in the definition of isInterleaveShuffle, but we could probably relax them. I'm going to look at that separately.
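
As a sanity check on the claim that two spread(2) shuffles plus a merge amount to one interleave, here is a rough Python model of the new sequence in the diff (`vzext.vf2`, `vsll.vx` by 32, `vmerge.vvm`), viewed as e32 lanes. The function names `spread2`, `merge` and `interleave` are illustrative only and do not correspond to LLVM APIs; the inputs are just the lane indices from the mask, used as stand-in data.

```python
def spread2(xs, offset):
    """Model of spread(2): put xs[i] in lane 2*i + offset of a vector twice
    as long. The other lanes are zero, matching zext (offset 0) or
    zext followed by a left shift of 32 bits (offset 1)."""
    out = [0] * (2 * len(xs))
    for i, x in enumerate(xs):
        out[2 * i + offset] = x
    return out


def merge(mask_bits, on_true, on_false):
    """Per-lane select: take on_true where the mask bit is set."""
    return [t if m else f for m, t, f in zip(mask_bits, on_true, on_false)]


def interleave(a, b):
    """Reference result: a[0], b[0], a[1], b[1], ..."""
    out = []
    for x, y in zip(a, b):
        out += [x, y]
    return out


# Stand-in data: the lane indices selected by the shuffle mask.
a = list(range(16, 32))  # high half of the first operand
b = list(range(48, 64))  # high half of the second operand

# `lui a1, 699051` then `addi a1, a1, -1366` materializes 0xAAAAAAAA,
# i.e. a mask with every odd lane set.
mask_const = (699051 << 12) - 1366
mask_bits = [(mask_const >> i) & 1 for i in range(32)]

result = merge(mask_bits, spread2(b, 1), spread2(a, 0))
print(result == interleave(a, b))  # prints "True"
```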

https://github.com/llvm/llvm-project/pull/118658
