[PATCH] D158195: [RISCV] Combine (vrot{l,r} vxi16, 8) -> vrev8
Craig Topper via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Aug 18 10:47:24 PDT 2023
craig.topper added a comment.
In D158195#4598747 <https://reviews.llvm.org/D158195#4598747>, @luke wrote:
> In D158195#4596052 <https://reviews.llvm.org/D158195#4596052>, @craig.topper wrote:
>
>> If the rotate came in as a fshl/fshr intrinsic or as shl+shr+or, would we already get vrev8 for fixed vectors? Is it only the shuffle case that is being optimized?
>
> Yeah, we already get vrev8 for these. DAGCombiner canonicalises them to bswap before they would be legalised to VL nodes:
>
> define <4 x i16> @rot_via_fshr(<4 x i16> %a) {
>   %res = call <4 x i16> @llvm.fshr.v4i16(<4 x i16> %a, <4 x i16> %a, <4 x i16> <i16 8, i16 8, i16 8, i16 8>)
>   ret <4 x i16> %res
> }
>
> declare <4 x i16> @llvm.fshr.v4i16(<4 x i16> %a, <4 x i16> %b, <4 x i16> %c)
>
> define <4 x i16> @rot_via_shift(<4 x i16> %a, <4 x i16> %amt) {
>   %1 = shl <4 x i16> %a, <i16 8, i16 8, i16 8, i16 8>
>   %2 = lshr <4 x i16> %a, <i16 8, i16 8, i16 8, i16 8>
>   %3 = or <4 x i16> %1, %2
>   ret <4 x i16> %3
> }
>
>
>
> === rot_via_fshr
> Initial selection DAG: %bb.0 'rot_via_fshr:'
> SelectionDAG has 13 nodes:
> t0: ch,glue = EntryToken
> t2: nxv2i16,ch = CopyFromReg t0, Register:nxv2i16 %0
> t4: v4i16 = extract_subvector t2, Constant:i64<0>
> t6: v4i16 = BUILD_VECTOR Constant:i16<8>, Constant:i16<8>, Constant:i16<8>, Constant:i16<8>
> t7: v4i16 = rotr t4, t6
> t9: nxv2i16 = insert_subvector undef:nxv2i16, t7, Constant:i64<0>
> t11: ch,glue = CopyToReg t0, Register:nxv2i16 $v8, t9
> t12: ch = RISCVISD::RET_GLUE t11, Register:nxv2i16 $v8, t11:1
>
>
> Optimized lowered selection DAG: %bb.0 'rot_via_fshr:'
> SelectionDAG has 11 nodes:
> t0: ch,glue = EntryToken
> t2: nxv2i16,ch = CopyFromReg t0, Register:nxv2i16 %0
> t4: v4i16 = extract_subvector t2, Constant:i64<0>
> t13: v4i16 = bswap t4
> t9: nxv2i16 = insert_subvector undef:nxv2i16, t13, Constant:i64<0>
> t11: ch,glue = CopyToReg t0, Register:nxv2i16 $v8, t9
> t12: ch = RISCVISD::RET_GLUE t11, Register:nxv2i16 $v8, t11:1
>
>
>
> === rot_via_shift
> Initial selection DAG: %bb.0 'rot_via_shift:'
> SelectionDAG has 18 nodes:
> t0: ch,glue = EntryToken
> t2: nxv2i16,ch = CopyFromReg t0, Register:nxv2i16 %0
> t4: v4i16 = extract_subvector t2, Constant:i64<0>
> t6: nxv2i16,ch = CopyFromReg t0, Register:nxv2i16 %1
> t7: v4i16 = extract_subvector t6, Constant:i64<0>
> t9: v4i16 = BUILD_VECTOR Constant:i16<8>, Constant:i16<8>, Constant:i16<8>, Constant:i16<8>
> t10: v4i16 = shl t4, t9
> t11: v4i16 = srl t4, t9
> t12: v4i16 = or t10, t11
> t14: nxv2i16 = insert_subvector undef:nxv2i16, t12, Constant:i64<0>
> t16: ch,glue = CopyToReg t0, Register:nxv2i16 $v8, t14
> t17: ch = RISCVISD::RET_GLUE t16, Register:nxv2i16 $v8, t16:1
>
>
> Optimized lowered selection DAG: %bb.0 'rot_via_shift:'
> SelectionDAG has 11 nodes:
> t0: ch,glue = EntryToken
> t2: nxv2i16,ch = CopyFromReg t0, Register:nxv2i16 %0
> t4: v4i16 = extract_subvector t2, Constant:i64<0>
> t19: v4i16 = bswap t4
> t14: nxv2i16 = insert_subvector undef:nxv2i16, t19, Constant:i64<0>
> t16: ch,glue = CopyToReg t0, Register:nxv2i16 $v8, t14
> t17: ch = RISCVISD::RET_GLUE t16, Register:nxv2i16 $v8, t16:1
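
For reference, the rotr-to-bswap fold visible in the dumps above rests on a simple identity: rotating a 16-bit value by 8 in either direction swaps its two bytes. The following is a minimal Python sketch of that identity only. The helper names (rotl16, rotr16, bswap16) are made up for illustration and are not LLVM functions.

```python
def rotl16(x, n):
    """Rotate a 16-bit value left by n bits (hypothetical helper)."""
    n %= 16
    x &= 0xFFFF
    return ((x << n) | (x >> (16 - n))) & 0xFFFF


def rotr16(x, n):
    """Rotate a 16-bit value right by n bits (hypothetical helper)."""
    return rotl16(x, 16 - (n % 16))


def bswap16(x):
    """Swap the two bytes of a 16-bit value (hypothetical helper)."""
    x &= 0xFFFF
    return ((x & 0xFF) << 8) | (x >> 8)


# Exhaustive check over all 16-bit values: both rotate directions by 8
# agree with the byte swap.
identity_holds = all(
    rotl16(x, 8) == rotr16(x, 8) == bswap16(x) for x in range(0x10000)
)
```

Because the check is exhaustive over the 16-bit range, it covers every lane value that a vxi16 rotate by 8 could see.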
How ugly would it be to do it as a special case during the shuffle lowering instead?
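
To make the question concrete, here is a rough Python sketch of what such a special case would have to recognise: a single-source byte shuffle whose mask swaps each adjacent pair of bytes, which is the same operation as a bswap of the i16 lanes. This is not the code in D158195, and the function name is hypothetical. It assumes the usual convention that -1 marks an undef mask element; a real version would operate on SelectionDAG shuffle nodes in C++.

```python
def is_bswap16_byte_mask(mask):
    """Return True if a single-source byte shuffle mask swaps every
    adjacent byte pair, i.e. acts as a per-i16 byte swap.

    Hypothetical helper for illustration only. Entries equal to -1
    are treated as undef and match anything.
    """
    if len(mask) % 2 != 0:
        return False
    for i, m in enumerate(mask):
        if m == -1:
            continue  # undef lane: any source byte is acceptable
        if m != (i ^ 1):
            return False  # lane i must read its pair partner i ^ 1
    return True
```

For example, the mask 1, 0, 3, 2, 5, 4, 7, 6 on a v8i8 shuffle passes the check, while the identity mask 0, 1, 2, 3 does not.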
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D158195/new/
https://reviews.llvm.org/D158195