[PATCH][AArch64] implement aarch64 neon instruction class AdvSIMD (3 diff)
Jiangning Liu
liujiangning1 at gmail.com
Mon Aug 26 03:13:35 PDT 2013
Hi Tim,
int_arm_neon_vaddhn, int_arm_neon_vmulls and friends are all defined by the
ARM target. I agree with your comments, but does that imply the ARM back-end's
implementation is inappropriate?
Thanks,
-Jiangning
2013/8/26 Tim Northover <t.p.northover at gmail.com>
> Hi Jiangning,
>
> I've just looked at the LLVM patch for now, since the comments may
> drastically change the Clang patch.
>
> + def _8h8b
> [...]
> + def _8H
>
> It would be nice to settle on a single naming convention for these
> instructions. Personally, I think I prefer the first, but I don't have
> a strong opinion either way.
>
> +defm SADDWvvv : NeonI_3VDW_s<0b0, 0b0001, "saddw", add, 1>;
> +defm UADDWvvv : NeonI_3VDW_u<0b1, 0b0001, "uaddw", add, 1>;
> +
> +defm SADDW2vvv : NeonI_3VDW2_s<0b0, 0b0001, "saddw2", add, 1>;
> +defm UADDW2vvv : NeonI_3VDW2_u<0b1, 0b0001, "uaddw2", add, 1>;
>
> I don't think any widening instructions are commutable. The addition
> part is, but the widening only happens to the RHS. You can't swap Rn
> and Rm on the instructions and get the same result.
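
To make the commutability point concrete, here is a rough one-lane model in Python. It is only a sketch: the names sext8 and saddw_lane are made up for illustration, it assumes a 16-bit wide lane and an 8-bit narrow lane, and "swapping" is modelled as the narrow operand reading the low 8 bits of what used to be the wide value.

```python
def sext8(x):
    """Sign-extend the low 8 bits of x to a Python int."""
    x &= 0xFF
    return x - 0x100 if x & 0x80 else x


def saddw_lane(wide, narrow):
    """Hypothetical one-lane model: 16-bit wide operand plus the
    sign-extended 8-bit narrow operand, wrapped to 16 bits."""
    return (wide + sext8(narrow)) & 0xFFFF


a, b = 0x0180, 0x0001
forward = saddw_lane(a, b)  # 0x0180 + 1
swapped = saddw_lane(b, a)  # 0x0001 + sext8(0x80), i.e. 1 - 128
```

Only the second operand is narrowed and extended, so the two orderings read different bits and generally disagree.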
>
> +defm ADDHNvvv : NeonI_3VDN_2Op<0b0, 0b0100, "addhn", int_arm_neon_vaddhn, 1>;
> +defm RADDHNvvv : NeonI_3VDN_2Op<0b1, 0b0100, "raddhn", int_arm_neon_vraddhn, 1>;
>
> Don't these have reasonably simple LLVM IR representations? For example:
>
> define <2 x i32> @addhn(<2 x i64> %lhs, <2 x i64> %rhs) {
> %sum = add <2 x i64> %lhs, %rhs
> %shift = lshr <2 x i64> %sum, <i64 32, i64 32>
> %trunc = trunc <2 x i64> %shift to <2 x i32>
> ret <2 x i32> %trunc
> }
>
> define <2 x i32> @raddhn(<2 x i64> %lhs, <2 x i64> %rhs) {
> %sum = add <2 x i64> %lhs, %rhs
> %rounded = add <2 x i64> %sum, <i64 2147483648, i64 2147483648>
> %shift = lshr <2 x i64> %rounded, <i64 32, i64 32>
> %trunc = trunc <2 x i64> %shift to <2 x i32>
> ret <2 x i32> %trunc
> }
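
The "add, then keep the high half" behaviour of addhn and raddhn can also be sketched per lane in Python. This is a model under assumptions, not the back-end's definition: it assumes 64-bit input lanes narrowed to 32 bits, and the function names are illustrative only.

```python
MASK64 = (1 << 64) - 1


def addhn_lane(lhs, rhs):
    """Add two 64-bit lanes (wrapping) and keep the high 32 bits."""
    total = (lhs + rhs) & MASK64
    return total >> 32


def raddhn_lane(lhs, rhs):
    """Same as addhn_lane, but add the rounding constant 1 << 31
    before taking the high half."""
    total = (lhs + rhs + (1 << 31)) & MASK64
    return total >> 32
```

For example, a sum of 0x80000000 narrows to 0 with addhn_lane but rounds up to 1 with raddhn_lane.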
>
> +defm SMULLvvv : NeonI_3VDL_2Op<0b0, 0b1100, "smull", int_arm_neon_vmulls, 1>;
> +defm UMULLvvv : NeonI_3VDL_2Op<0b1, 0b1100, "umull", int_arm_neon_vmullu, 1>;
>
> Aren't these even simpler than addhn and friends? An extend followed
> by a multiply? They're also always commutable, so commutability probably
> doesn't need to be a template parameter (same for sabdl and uabdl).
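
As a sanity check on "an extend followed by a multiply", here is a small per-lane Python model. It is a sketch only: the helper names are hypothetical and it assumes 16-bit source lanes producing 32-bit results.

```python
def sext(x, bits):
    """Sign-extend the low `bits` bits of x to a Python int."""
    x &= (1 << bits) - 1
    return x - (1 << bits) if x & (1 << (bits - 1)) else x


def smull_lane(a, b, bits=16):
    """Sign-extend both narrow lanes, multiply, wrap to 2*bits."""
    return (sext(a, bits) * sext(b, bits)) & ((1 << (2 * bits)) - 1)


def umull_lane(a, b, bits=16):
    """Zero-extend both narrow lanes and multiply."""
    mask = (1 << bits) - 1
    return (a & mask) * (b & mask)
```

Both operands are extended in the same way before the multiply, which is why swapping them cannot change the result.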
>
> +defm SQDMLALvvv : NeonI_3VDL_3Op_v2<0b0, 0b1001, "sqdmlal",
> + int_arm_neon_vqdmlal>;
>
> The qdmlals are just qdmulls with an extra addition, I think.
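
A per-lane Python sketch of that relationship, for illustration only. It assumes 16-bit signed source lanes, a 32-bit accumulator, and that the extra addition saturates as well (my reading of the architecture's description); the function names are hypothetical.

```python
def sat(x, bits):
    """Clamp x to the signed range representable in `bits` bits."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return max(lo, min(hi, x))


def sqdmull_lane(a, b, bits=16):
    """Saturating doubling multiply of two signed narrow lanes."""
    return sat(2 * a * b, 2 * bits)


def sqdmlal_lane(acc, a, b, bits=16):
    """sqdmull followed by a saturating add into the accumulator."""
    return sat(acc + sqdmull_lane(a, b, bits), 2 * bits)
```

The only input pair that saturates the multiply itself is the most negative value times itself. The accumulate step can saturate for many more inputs.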
>
> Cheers.
>
> Tim.
>