[llvm] [RISCV] Decompose LMUL > 1 reverses into LMUL * M1 vrgather.vv (PR #104574)
Philip Reames via llvm-commits
llvm-commits at lists.llvm.org
Fri Aug 16 09:43:47 PDT 2024
================
@@ -1216,18 +1432,27 @@ define <4 x double> @reverse_v4f64_2(<2 x double> %a, < 2 x double> %b) {
define <8 x double> @reverse_v8f64_2(<4 x double> %a, <4 x double> %b) {
; CHECK-LABEL: reverse_v8f64_2:
; CHECK: # %bb.0:
-; CHECK-NEXT: vmv2r.v v16, v10
-; CHECK-NEXT: vsetivli zero, 8, e16, m1, ta, ma
-; CHECK-NEXT: vid.v v10
-; CHECK-NEXT: vrsub.vi v11, v10, 7
-; CHECK-NEXT: vsetvli zero, zero, e64, m4, ta, ma
-; CHECK-NEXT: vrgatherei16.vv v12, v8, v11
+; CHECK-NEXT: csrr a0, vlenb
+; CHECK-NEXT: srli a1, a0, 3
+; CHECK-NEXT: addi a1, a1, -1
+; CHECK-NEXT: vsetvli a2, zero, e64, m1, ta, ma
+; CHECK-NEXT: vid.v v12
+; CHECK-NEXT: vrsub.vx v12, v12, a1
+; CHECK-NEXT: vrgather.vv v19, v8, v12
+; CHECK-NEXT: vrgather.vv v18, v9, v12
+; CHECK-NEXT: vrgather.vv v16, v8, v12
+; CHECK-NEXT: vmv2r.v v12, v10
+; CHECK-NEXT: vmv.v.v v17, v16
+; CHECK-NEXT: srli a0, a0, 1
+; CHECK-NEXT: addi a0, a0, -8
+; CHECK-NEXT: vsetivli zero, 8, e64, m4, ta, ma
+; CHECK-NEXT: vslidedown.vx v8, v16, a0
----------------
preames wrote:
In the fixed-vector tests, we should have exact-VLEN tests for (a) precisely full registers (where the vslidedown can be pruned entirely), and (b) a vector prefix (where the slide amount is a known constant). I skimmed and didn't see these, but there's enough test churn that I may have just missed them.
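
As a rough illustration of the two cases, the slide amount in the CHECK lines above is computed as `(vlenb >> 1) - 8`, i.e. the number of e64 elements in an m4 register group minus the 8 elements of the fixed vector. The sketch below only evaluates that arithmetic for a few exact VLEN values. The helper names are hypothetical, and it assumes the m4 container is held fixed; which container the backend would actually pick at a given exact VLEN is not modeled here.

```python
def vlmax(vlen_bits: int, sew_bits: int, lmul: int) -> int:
    """Elements of width sew_bits in a register group of lmul registers."""
    return (vlen_bits // sew_bits) * lmul


def slide_amount(vlen_bits: int, num_elts: int, sew_bits: int = 64, lmul: int = 4) -> int:
    """Slide needed to move a reversed fixed vector back to element 0.

    Hypothetical helper: assumes the container LMUL is given, not derived.
    """
    amount = vlmax(vlen_bits, sew_bits, lmul) - num_elts
    if amount < 0:
        raise ValueError("fixed vector does not fit in the container")
    return amount


def slide_amount_from_vlenb(vlenb: int) -> int:
    """Mirrors 'srli a0, a0, 1' then 'addi a0, a0, -8' from the test output."""
    return (vlenb >> 1) - 8


if __name__ == "__main__":
    for vlen in (128, 256, 512):
        # vlenb is VLEN in bytes
        print(vlen, slide_amount(vlen, 8), slide_amount_from_vlenb(vlen // 8))
```

Under these assumptions, VLEN=128 gives a slide amount of zero for the 8 x double case (the "precisely full registers" case, where the vslidedown is a no-op), while a larger exact VLEN with the same container gives a nonzero compile-time constant (the "vector prefix" case).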
https://github.com/llvm/llvm-project/pull/104574