[PATCH] D158195: [RISCV] Combine (vrot{l,r} vxi16, 8) -> vrev8
Luke Lau via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Aug 18 06:24:53 PDT 2023
luke added a comment.
In D158195#4596052 <https://reviews.llvm.org/D158195#4596052>, @craig.topper wrote:
> If the rotate came in as a fshl/fshr intrinsic or as shl+shr+or would we already get vrev8 for fixed vectors? Is it only the shuffle case that is being optimized?
Yeah, we already get vrev8 for these. DAGCombiner canonicalises them to bswap before they would be legalised to VL nodes:
define <4 x i16> @rot_via_fshr(<4 x i16> %a) {
  %res = call <4 x i16> @llvm.fshr.v4i16(<4 x i16> %a, <4 x i16> %a, <4 x i16> <i16 8, i16 8, i16 8, i16 8>)
  ret <4 x i16> %res
}
declare <4 x i16> @llvm.fshr.v4i16(<4 x i16> %a, <4 x i16> %b, <4 x i16> %c)

define <4 x i16> @rot_via_shift(<4 x i16> %a, <4 x i16> %amt) {
  %1 = shl <4 x i16> %a, <i16 8, i16 8, i16 8, i16 8>
  %2 = lshr <4 x i16> %a, <i16 8, i16 8, i16 8, i16 8>
  %3 = or <4 x i16> %1, %2
  ret <4 x i16> %3
}
=== rot_via_fshr
Initial selection DAG: %bb.0 'rot_via_fshr:'
SelectionDAG has 13 nodes:
t0: ch,glue = EntryToken
t2: nxv2i16,ch = CopyFromReg t0, Register:nxv2i16 %0
t4: v4i16 = extract_subvector t2, Constant:i64<0>
t6: v4i16 = BUILD_VECTOR Constant:i16<8>, Constant:i16<8>, Constant:i16<8>, Constant:i16<8>
t7: v4i16 = rotr t4, t6
t9: nxv2i16 = insert_subvector undef:nxv2i16, t7, Constant:i64<0>
t11: ch,glue = CopyToReg t0, Register:nxv2i16 $v8, t9
t12: ch = RISCVISD::RET_GLUE t11, Register:nxv2i16 $v8, t11:1
Optimized lowered selection DAG: %bb.0 'rot_via_fshr:'
SelectionDAG has 11 nodes:
t0: ch,glue = EntryToken
t2: nxv2i16,ch = CopyFromReg t0, Register:nxv2i16 %0
t4: v4i16 = extract_subvector t2, Constant:i64<0>
t13: v4i16 = bswap t4
t9: nxv2i16 = insert_subvector undef:nxv2i16, t13, Constant:i64<0>
t11: ch,glue = CopyToReg t0, Register:nxv2i16 $v8, t9
t12: ch = RISCVISD::RET_GLUE t11, Register:nxv2i16 $v8, t11:1
=== rot_via_shift
Initial selection DAG: %bb.0 'rot_via_shift:'
SelectionDAG has 18 nodes:
t0: ch,glue = EntryToken
t2: nxv2i16,ch = CopyFromReg t0, Register:nxv2i16 %0
t4: v4i16 = extract_subvector t2, Constant:i64<0>
t6: nxv2i16,ch = CopyFromReg t0, Register:nxv2i16 %1
t7: v4i16 = extract_subvector t6, Constant:i64<0>
t9: v4i16 = BUILD_VECTOR Constant:i16<8>, Constant:i16<8>, Constant:i16<8>, Constant:i16<8>
t10: v4i16 = shl t4, t9
t11: v4i16 = srl t4, t9
t12: v4i16 = or t10, t11
t14: nxv2i16 = insert_subvector undef:nxv2i16, t12, Constant:i64<0>
t16: ch,glue = CopyToReg t0, Register:nxv2i16 $v8, t14
t17: ch = RISCVISD::RET_GLUE t16, Register:nxv2i16 $v8, t16:1
Optimized lowered selection DAG: %bb.0 'rot_via_shift:'
SelectionDAG has 11 nodes:
t0: ch,glue = EntryToken
t2: nxv2i16,ch = CopyFromReg t0, Register:nxv2i16 %0
t4: v4i16 = extract_subvector t2, Constant:i64<0>
t19: v4i16 = bswap t4
t14: nxv2i16 = insert_subvector undef:nxv2i16, t19, Constant:i64<0>
t16: ch,glue = CopyToReg t0, Register:nxv2i16 $v8, t14
t17: ch = RISCVISD::RET_GLUE t16, Register:nxv2i16 $v8, t16:1
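For reference, the fold seen in both dumps relies on a rotate by 8 of an i16 lane (in either direction) being the same operation as a byte swap. The sketch below is only a scalar model of that lane arithmetic, not LLVM code; the helper names rotl16, rotr16, bswap16 and check_all are hypothetical and chosen for illustration.

```python
MASK16 = 0xFFFF  # model a single i16 lane


def rotl16(x, n):
    """Rotate a 16-bit value left by n bits."""
    n %= 16
    return ((x << n) | (x >> (16 - n))) & MASK16


def rotr16(x, n):
    """Rotate a 16-bit value right by n bits."""
    n %= 16
    return ((x >> n) | (x << (16 - n))) & MASK16


def bswap16(x):
    """Swap the two bytes of a 16-bit value."""
    return ((x & 0xFF) << 8) | (x >> 8)


def check_all():
    """Exhaustively compare the three operations over every i16 value."""
    return all(
        rotl16(x, 8) == rotr16(x, 8) == bswap16(x)
        for x in range(1 << 16)
    )
```

Since 8 is half the lane width, rotating left or right by 8 lands on the same result, which is why both the fshr form and the shl+lshr+or form end up as a single bswap node in this model.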
================
Comment at: llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-reverse.ll:278
+; ZVBB-NEXT: vsetivli zero, 1, e32, mf2, ta, ma
+; ZVBB-NEXT: vror.vi v8, v8, 16
+; ZVBB-NEXT: ret
----------------
craig.topper wrote:
> How does this patch create new rotates?
Not sure how I didn't notice these. It looks like it always emitted rotates with Zvbb; there's just an issue with the FileCheck prefixes.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D158195/new/
https://reviews.llvm.org/D158195