[PATCH][AArch64] implement aarch64 neon instruction class AdvSIMD (shift)
Tim Northover
t.p.northover at gmail.com
Fri Aug 30 02:31:53 PDT 2013
Hi Hao,
Thanks for working on these.
> Attached are patches to implement aarch64 neon instruction class AdvSIMD
> (shift), which patches implemented 21 shift instructions and 4 convert
> instructions. Most of them are implemented like ARMv7, except :
I'm not convinced ARM does this in a good way. It looks like Clang sees
    int32x2_t vrshr_n_s32(int32x2_t a, int32_t amt)
and converts it to:
    <2 x i32> @llvm.arm.neon.vrshifts.v2i32(<2 x i32> %a,
                                            <2 x i32> <i32 %amt, i32 %amt>)
The backend then pattern-matches the second operand and maps the whole
thing back to a specialised AArch64ISD node very much like what Clang saw,
which gets selected. This makes sense to me if there's also a register
form and the instruction with an immediate is just an optimisation for
when that shift amount is a known constant, but isn't that only the
case for UQSHL and SQSHL?
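
To make the round trip concrete, here is a minimal sketch in portable C of the kind of check the backend has to do on that second operand: confirm every lane holds the same amount and that it fits the immediate range. The function name and signature are hypothetical and only illustrate the idea; this is not the code in AArch64ISelLowering.cpp.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not an LLVM API): returns true and stores the shared
   lane value in *imm when all lanes are equal and lie within [lo, hi]. */
static bool match_splat_imm(const int32_t *lanes, size_t count,
                            int32_t lo, int32_t hi, int32_t *imm) {
    if (count == 0)
        return false;
    for (size_t i = 1; i < count; ++i) {
        if (lanes[i] != lanes[0])
            return false;          /* not a splat: lanes differ */
    }
    if (lanes[0] < lo || lanes[0] > hi)
        return false;              /* splat, but outside the immediate range */
    *imm = lanes[0];
    return true;
}
```

If the intrinsic carried a scalar immediate in the first place, none of this recovery step would be needed.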
The others seem to be immediate-only instructions, so wouldn't it make
sense for Clang to produce:
    <2 x i32> @llvm.aarch64.neon.srshr.v2i32(<2 x i32> %a, i32 12)
Then there would be no need for any special handling in
AArch64ISelLowering.cpp and the intrinsic could be matched directly in
TableGen:
    def : Pat<(int_aarch64_neon_srshr v2i32:$Rn, imm1_32:$imm),
              (SRSHRvvi_2s v2i32:$Rn, imm1_32:$imm)>;
It looks like the only shifts this doesn't apply to are UQSHL/SQSHL
(which do have a register form as well).
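
For reference, a scalar model of what the immediate form computes, written as a portable C sketch. It assumes the usual rounding-shift-right semantics (add 2^(n-1), then shift right arithmetically) and an immediate in the 1..32 range that imm1_32 suggests; the helper names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Portable arithmetic shift right (floor division by 2^n), for 0 < n < 64.
   Avoids relying on implementation-defined >> of negative values. */
static int64_t asr64(int64_t x, unsigned n) {
    return x >= 0 ? (x >> n) : ~((~x) >> n);
}

/* Hypothetical model of one lane: add the rounding constant 2^(n-1) in
   64 bits so the sum cannot overflow, then shift right by n. */
static int32_t srshr_lane(int32_t a, unsigned n) {
    assert(n >= 1 && n <= 32);
    int64_t rounded = (int64_t)a + ((int64_t)1 << (n - 1));
    return (int32_t)asr64(rounded, n);
}

/* Model of the 2 x i32 form: one scalar immediate applies to both lanes. */
static void srshr_v2i32(const int32_t in[2], unsigned n, int32_t out[2]) {
    out[0] = srshr_lane(in[0], n);
    out[1] = srshr_lane(in[1], n);
}
```

With this model, srshr_lane(5, 1) gives 3 and srshr_lane(-5, 1) gives -2; the shift amount is a single scalar, which is why a plain i32 operand on the intrinsic is enough.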
> 1) SHRN is implemented by IR (lshr/ashr, truncate) instead of IR intrinsics.
> 2) There are some special instructions added in AArch64: shift narrow high,
> which is implemented by combining shuffle vector and normal shift narrow
> instructions.
That's good. I like those implementations.
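
As a rough illustration of the two points above, the following portable C sketch models an unsigned shift-right-narrow as a logical shift plus truncate, and the "high" variant as that result placed in the upper half of a vector whose lower half is preserved (the shuffle). The function names are hypothetical and the 4 x u32 to 4 x u16 shape is just one example; it is a model of the semantics, not the patch's lowering.

```c
#include <assert.h>
#include <stdint.h>

/* Shift right narrow, modelled as lshr followed by truncate:
   4 x u32 -> 4 x u16, with the shift amount n in 1..16. */
static void shrn_u32(const uint32_t in[4], unsigned n, uint16_t out[4]) {
    assert(n >= 1 && n <= 16);
    for (int i = 0; i < 4; ++i)
        out[i] = (uint16_t)(in[i] >> n);   /* truncation keeps the low 16 bits */
}

/* "High" variant: the existing low half is kept and the narrowed result
   fills the upper half, i.e. a concatenating shuffle of the two. */
static void shrn_high_u32(const uint16_t low[4], const uint32_t in[4],
                          unsigned n, uint16_t out[8]) {
    uint16_t narrowed[4];
    shrn_u32(in, n, narrowed);
    for (int i = 0; i < 4; ++i) {
        out[i] = low[i];
        out[i + 4] = narrowed[i];
    }
}
```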
Cheers.
Tim.