[PATCH] D157417: [RISCV][SelectionDAG] Lower shuffles as bitrotates with vror.vi when possible
Craig Topper via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Aug 17 10:27:52 PDT 2023
craig.topper added a comment.
In D157417#4595013 <https://reviews.llvm.org/D157417#4595013>, @luke wrote:
> In D157417#4593411 <https://reviews.llvm.org/D157417#4593411>, @reames wrote:
>
> >> Broader point is that maybe we should be doing this even without zvbb.
>
> For reference, this is the new sequence we would be generating without zvbb:
>
> define <8 x i16> @shuffle_v8i16_as_i32(<8 x i16> %v) {
> ; CHECK-LABEL: shuffle_v8i16_as_i32:
> ; CHECK: # %bb.0:
> -; CHECK-NEXT: lui a0, %hi(.LCPI18_0)
> -; CHECK-NEXT: addi a0, a0, %lo(.LCPI18_0)
> -; CHECK-NEXT: vsetivli zero, 8, e16, m1, ta, ma
> -; CHECK-NEXT: vle16.v v10, (a0)
> -; CHECK-NEXT: vrgather.vv v9, v8, v10
> -; CHECK-NEXT: vmv.v.v v8, v9
> +; CHECK-NEXT: vsetivli zero, 4, e16, mf2, ta, ma
> +; CHECK-NEXT: vmv.v.i v9, 0
> +; CHECK-NEXT: li a0, 16
> +; CHECK-NEXT: vwsubu.vx v10, v9, a0
> +; CHECK-NEXT: li a1, 31
> +; CHECK-NEXT: vsetvli zero, zero, e32, m1, ta, ma
> +; CHECK-NEXT: vand.vx v9, v10, a1
> +; CHECK-NEXT: vsrl.vv v9, v8, v9
> +; CHECK-NEXT: vmv.v.x v10, a0
> +; CHECK-NEXT: vand.vx v10, v10, a1
> +; CHECK-NEXT: vsll.vv v8, v8, v10
> +; CHECK-NEXT: vor.vv v8, v8, v9
> ; CHECK-NEXT: ret
> ;
> ; ZVBB_V-LABEL: shuffle_v8i16_as_i32:
> ; ZVBB_V: # %bb.0:
> ; ZVBB_V-NEXT: vsetivli zero, 4, e32, m1, ta, ma
> ; ZVBB_V-NEXT: vror.vi v8, v8, 16
> ; ZVBB_V-NEXT: ret
> ;
> ; ZVBB_ZVE32X-LABEL: shuffle_v8i16_as_i32:
> ; ZVBB_ZVE32X: # %bb.0:
> ; ZVBB_ZVE32X-NEXT: vsetivli zero, 4, e32, m4, ta, ma
> ; ZVBB_ZVE32X-NEXT: vror.vi v8, v8, 16
> ; ZVBB_ZVE32X-NEXT: ret
> %shuffle = shufflevector <8 x i16> %v, <8 x i16> poison, <8 x i32> <i32 1, i32 0, i32 3, i32 2, i32 5, i32 4, i32 7, i32 6>
> ret <8 x i16> %shuffle
> }
Why isn't this constant folded?
+; CHECK-NEXT: vmv.v.i v9, 0
+; CHECK-NEXT: li a0, 16
+; CHECK-NEXT: vwsubu.vx v10, v9, a0
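To spell out the arithmetic behind that question: v9 is a splat of 0 and a0 holds 16, so the widening subtract produces 0 - 16 in every 32-bit lane, and the later vand.vx with 31 reduces that to 16. A minimal sketch of the computation, with Python used only as a calculator (the function name is hypothetical, not anything in LLVM):

```python
SEW = 32  # element width after the widening subtract, in bits

def rotate_shift_amounts(rot, sew=SEW):
    """Return (srl_amount, sll_amount) as the expanded rotate computes them."""
    lane_mask = (1 << sew) - 1
    neg = (0 - rot) & lane_mask      # vwsubu.vx: 0 - rot, wrapped to sew bits
    # The two vand.vx instructions mask both amounts with sew - 1 (31 here).
    return neg & (sew - 1), rot & (sew - 1)

print(rotate_shift_amounts(16))
```

Both amounts come out as the same compile-time constant for this shuffle, so nothing in the sequence depends on a runtime value.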
This AND feels unnecessary. The shift only uses the lower 5 bits of the shift amount
+; CHECK-NEXT: li a1, 31
+; CHECK-NEXT: vsetvli zero, zero, e32, m1, ta, ma
+; CHECK-NEXT: vand.vx v9, v10, a1
+; CHECK-NEXT: vsrl.vv v9, v8, v9
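As a rough illustration of why the mask is redundant, the sketch below models one e32 lane of the expanded rotate with and without the explicit AND, and also checks that rotating each 32-bit lane by 16 reproduces the <1, 0, 3, 2, 5, 4, 7, 6> shuffle of i16 elements. This is a plain Python model under the assumption stated above (the vector shifts read only the low 5 bits of the amount at SEW=32); the helper names are made up for illustration.

```python
MASK32 = 0xFFFFFFFF

def vsrl(x, amt):
    # Per-lane model of vsrl at SEW=32: only the low 5 bits of amt are used.
    return (x & MASK32) >> (amt & 31)

def vsll(x, amt):
    # Per-lane model of vsll at SEW=32, same treatment of the shift amount.
    return (x << (amt & 31)) & MASK32

def rotate_expanded(x, rot, premask):
    # Mirrors the expanded sequence: srl by (0 - rot), sll by rot, then or.
    neg = (0 - rot) & MASK32
    if premask:  # the explicit vand.vx with 31 from the generated code
        neg, rot = neg & 31, rot & 31
    return vsll(x, rot) | vsrl(x, neg)

def shuffle_via_rotate(v):
    # Pack pairs of i16 into i32 lanes (little-endian), rotate by 16, unpack.
    out = []
    for lo, hi in zip(v[0::2], v[1::2]):
        r = rotate_expanded(lo | (hi << 16), 16, premask=False)
        out += [r & 0xFFFF, r >> 16]
    return out

v = [0x1111, 0x2222, 0x3333, 0x4444, 0x5555, 0x6666, 0x7777, 0x8888]
shuffle_mask = [1, 0, 3, 2, 5, 4, 7, 6]
print(shuffle_via_rotate(v) == [v[i] for i in shuffle_mask])
print(all(rotate_expanded(x, 16, True) == rotate_expanded(x, 16, False)
          for x in (0, 1, 0x12345678, MASK32)))
```

In this model the pre-masked and unmasked variants agree for every input tried, which is the point of the comment: the vand.vx before the shift does no work.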
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D157417/new/
https://reviews.llvm.org/D157417