[PATCH] D158195: [RISCV] Combine (vrot{l,r} vxi16, 8) -> vrev8
Philip Reames via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Aug 18 12:50:10 PDT 2023
reames added a comment.
In D158195#4599594 <https://reviews.llvm.org/D158195#4599594>, @craig.topper wrote:
> In D158195#4598747 <https://reviews.llvm.org/D158195#4598747>, @luke wrote:
>
>> In D158195#4596052 <https://reviews.llvm.org/D158195#4596052>, @craig.topper wrote:
>>
>>> If the rotate came in as a fshl/fshr intrinsic or as shl+shr+or, would we already get vrev8 for fixed vectors? Is it only the shuffle case that is being optimized?
>>
>> Yeah, we already get vrev8 for these. DAGCombiner canonicalises them to bswap before they would be legalised to vl nodes:
>>
>> define <4 x i16> @rot_via_fshr(<4 x i16> %a) {
>> %res = call <4 x i16> @llvm.fshr.v4i16(<4 x i16> %a, <4 x i16> %a, <4 x i16> <i16 8, i16 8, i16 8, i16 8>)
>> ret <4 x i16> %res
>> }
>>
>> declare <4 x i16> @llvm.fshr.v4i16(<4 x i16> %a, <4 x i16> %b, <4 x i16> %c)
>>
>> define <4 x i16> @rot_via_shift(<4 x i16> %a, <4 x i16> %amt) {
>> %1 = shl <4 x i16> %a, <i16 8, i16 8, i16 8, i16 8>
>> %2 = lshr <4 x i16> %a, <i16 8, i16 8, i16 8, i16 8>
>> %3 = or <4 x i16> %1, %2
>> ret <4 x i16> %3
>> }
>>
>>
>>
>> === rot_via_fshr
>> Initial selection DAG: %bb.0 'rot_via_fshr:'
>> SelectionDAG has 13 nodes:
>> t0: ch,glue = EntryToken
>> t2: nxv2i16,ch = CopyFromReg t0, Register:nxv2i16 %0
>> t4: v4i16 = extract_subvector t2, Constant:i64<0>
>> t6: v4i16 = BUILD_VECTOR Constant:i16<8>, Constant:i16<8>, Constant:i16<8>, Constant:i16<8>
>> t7: v4i16 = rotr t4, t6
>> t9: nxv2i16 = insert_subvector undef:nxv2i16, t7, Constant:i64<0>
>> t11: ch,glue = CopyToReg t0, Register:nxv2i16 $v8, t9
>> t12: ch = RISCVISD::RET_GLUE t11, Register:nxv2i16 $v8, t11:1
>>
>>
>> Optimized lowered selection DAG: %bb.0 'rot_via_fshr:'
>> SelectionDAG has 11 nodes:
>> t0: ch,glue = EntryToken
>> t2: nxv2i16,ch = CopyFromReg t0, Register:nxv2i16 %0
>> t4: v4i16 = extract_subvector t2, Constant:i64<0>
>> t13: v4i16 = bswap t4
>> t9: nxv2i16 = insert_subvector undef:nxv2i16, t13, Constant:i64<0>
>> t11: ch,glue = CopyToReg t0, Register:nxv2i16 $v8, t9
>> t12: ch = RISCVISD::RET_GLUE t11, Register:nxv2i16 $v8, t11:1
>>
>>
>>
>> === rot_via_shift
>> Initial selection DAG: %bb.0 'rot_via_shift:'
>> SelectionDAG has 18 nodes:
>> t0: ch,glue = EntryToken
>> t2: nxv2i16,ch = CopyFromReg t0, Register:nxv2i16 %0
>> t4: v4i16 = extract_subvector t2, Constant:i64<0>
>> t6: nxv2i16,ch = CopyFromReg t0, Register:nxv2i16 %1
>> t7: v4i16 = extract_subvector t6, Constant:i64<0>
>> t9: v4i16 = BUILD_VECTOR Constant:i16<8>, Constant:i16<8>, Constant:i16<8>, Constant:i16<8>
>> t10: v4i16 = shl t4, t9
>> t11: v4i16 = srl t4, t9
>> t12: v4i16 = or t10, t11
>> t14: nxv2i16 = insert_subvector undef:nxv2i16, t12, Constant:i64<0>
>> t16: ch,glue = CopyToReg t0, Register:nxv2i16 $v8, t14
>> t17: ch = RISCVISD::RET_GLUE t16, Register:nxv2i16 $v8, t16:1
>>
>>
>> Optimized lowered selection DAG: %bb.0 'rot_via_shift:'
>> SelectionDAG has 11 nodes:
>> t0: ch,glue = EntryToken
>> t2: nxv2i16,ch = CopyFromReg t0, Register:nxv2i16 %0
>> t4: v4i16 = extract_subvector t2, Constant:i64<0>
>> t19: v4i16 = bswap t4
>> t14: nxv2i16 = insert_subvector undef:nxv2i16, t19, Constant:i64<0>
>> t16: ch,glue = CopyToReg t0, Register:nxv2i16 $v8, t14
>> t17: ch = RISCVISD::RET_GLUE t16, Register:nxv2i16 $v8, t16:1
>
> How ugly would it be to do it as a special case during the shuffle lowering instead?
Another possibility would be a RISCV-specific combine from shuffle to bswap before lowering, but handling this as a special case in the shuffle lowering doesn't seem bad to me.
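
For anyone skimming the thread, here is a small sketch of the identity the combine relies on: rotating a 16-bit lane by 8 in either direction is a byte swap, and a byte shuffle that swaps adjacent pairs is the same operation seen through a v8i8 view. This is illustrative Python, not code from the patch. The helper names (rotl16, rotr16, bswap16, shuffle_bytes) and the mask <1,0,3,2,5,4,7,6> are my own assumptions for the example.

```python
def rotl16(x, n):
    """Rotate a 16-bit value left by n bits."""
    n %= 16
    x &= 0xFFFF
    return ((x << n) | (x >> (16 - n))) & 0xFFFF if n else x


def rotr16(x, n):
    """Rotate a 16-bit value right by n bits (a left rotate by 16 - n)."""
    return rotl16(x, (16 - n) % 16)


def bswap16(x):
    """Swap the two bytes of a 16-bit value."""
    x &= 0xFFFF
    return ((x & 0xFF) << 8) | (x >> 8)


def shuffle_bytes(lanes):
    """View i16 lanes as little-endian bytes, swap each adjacent byte pair
    (mask <1,0,3,2,...>), and view the result as i16 lanes again."""
    data = b"".join(lane.to_bytes(2, "little") for lane in lanes)
    mask = [i ^ 1 for i in range(len(data))]  # hypothetical shuffle mask
    out = bytes(data[m] for m in mask)
    return [int.from_bytes(out[i:i + 2], "little") for i in range(0, len(out), 2)]


if __name__ == "__main__":
    lanes = [0x1234, 0xABCD, 0x00FF, 0xFF00]
    print([hex(v) for v in shuffle_bytes(lanes)])
    print([hex(bswap16(v)) for v in lanes])
```

Both printed lists should match, which is why the shuffle, the rotate, and bswap can all end up as the same instruction on i16 elements.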
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D158195/new/
https://reviews.llvm.org/D158195