[PATCH][AArch64] implement aarch64 neon instruction class AdvSIMD (3 diff)

Mon Aug 26 03:13:35 PDT 2013

Hi Tim,

int_arm_neon_vaddhn/int_arm_neon_vmulls and friends are all defined by the
ARM target. I agree with your comments, but does that imply the ARM back-end's
implementation is inappropriate as well?

Thanks,
-Jiangning

2013/8/26 Tim Northover <t.p.northover at gmail.com>

> Hi Jiangning,
>
> I've just looked at the LLVM patch for now, since the comments may
> drastically change the Clang patch.
>
> +  def _8h8b
> [...]
> +    def _8H
>
> It would be nice to settle on a single naming convention for these
> instructions. Personally, I think I prefer the first, but I don't have
> a strong opinion either way.
>
> +defm SADDWvvv :  NeonI_3VDW_s<0b0, 0b0001, "saddw", add, 1>;
> +defm UADDWvvv :  NeonI_3VDW_u<0b1, 0b0001, "uaddw", add, 1>;
> +
> +defm SADDW2vvv :  NeonI_3VDW2_s<0b0, 0b0001, "saddw2", add, 1>;
> +defm UADDW2vvv :  NeonI_3VDW2_u<0b1, 0b0001, "uaddw2", add, 1>;
>
> I don't think any of the wide (saddw-style) instructions are commutable.
> The addition part is, but the widening only happens to the RHS. You can't
> swap Rn and Rm on the instructions and get the same result.
>
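
A small register-level model makes the asymmetry concrete. The sketch
below is plain Python, not anything from the patch. The helper names
(lanes, sext, saddw_8h) are made up for illustration, and it assumes the
8h/8b form, where Rn supplies eight 16-bit lanes and only the eight 8-bit
lanes of Rm are sign-extended.

```python
def lanes(reg, bits, count):
    """Split the low bits of an integer register image into unsigned lanes."""
    mask = (1 << bits) - 1
    return [(reg >> (i * bits)) & mask for i in range(count)]


def sext(value, bits):
    """Reinterpret an unsigned lane of the given width as a signed integer."""
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def saddw_8h(rn, rm):
    """Model saddw Vd.8h, Vn.8h, Vm.8b on integer register images."""
    wide = lanes(rn, 16, 8)    # Rn: eight 16-bit lanes, used unchanged
    narrow = lanes(rm, 8, 8)   # Rm: eight 8-bit lanes, sign-extended below
    out = 0
    for i, (w, n) in enumerate(zip(wide, narrow)):
        out |= ((w + sext(n, 8)) & 0xFFFF) << (16 * i)
    return out


a, b = 0x0102, 0x0304
print(hex(saddw_8h(a, b)))  # Rn = a, Rm = b
print(hex(saddw_8h(b, a)))  # operands swapped
```

The two calls print different values, because the register passed as Rm
is split into 8-bit lanes while the one passed as Rn is not.
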
> +defm ADDHNvvv  : NeonI_3VDN_2Op<0b0, 0b0100, "addhn", int_arm_neon_vaddhn, 1>;
> +defm RADDHNvvv : NeonI_3VDN_2Op<0b1, 0b0100, "raddhn", int_arm_neon_vraddhn, 1>;
>
> Don't these have reasonably simple LLVM IR representations? For example:
>
> define <2 x i32> @addhn(<2 x i64> %lhs, <2 x i64> %rhs) {
>   %sum = add <2 x i64> %lhs, %rhs
>   ; keep the high half: shift it down, then truncate
>   %shift = lshr <2 x i64> %sum, <i64 32, i64 32>
>   %trunc = trunc <2 x i64> %shift to <2 x i32>
>   ret <2 x i32> %trunc
> }
>
> define <2 x i32> @raddhn(<2 x i64> %lhs, <2 x i64> %rhs) {
>   %sum = add <2 x i64> %lhs, %rhs
>   ; rounding constant 1 << 31, written in decimal
>   %rounded = add <2 x i64> %sum, <i64 2147483648, i64 2147483648>
>   %shift = lshr <2 x i64> %rounded, <i64 32, i64 32>
>   %trunc = trunc <2 x i64> %shift to <2 x i32>
>   ret <2 x i32> %trunc
> }
>
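
For reference, the per-lane arithmetic these instructions perform can be
modelled in a few lines. This is a Python sketch for the 64-bit to 32-bit
case only. The function names are illustrative, and it assumes the usual
definition: add with wrap-around, optionally add the rounding constant
1 << 31, then keep the high 32 bits.

```python
MASK64 = (1 << 64) - 1


def addhn_lane(lhs, rhs):
    """High 32 bits of the wrapping 64-bit sum of two unsigned lane images."""
    return ((lhs + rhs) & MASK64) >> 32


def raddhn_lane(lhs, rhs):
    """As addhn_lane, but add 1 << 31 first so the result is rounded."""
    return ((lhs + rhs + (1 << 31)) & MASK64) >> 32


x = 0x0000000180000000  # high half 1, low half exactly one half
print(addhn_lane(x, 0))   # truncating narrow
print(raddhn_lane(x, 0))  # rounding narrow
```

With this input the truncating form keeps the high half as it is, while
the rounding form carries into it.
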
> +defm SMULLvvv :  NeonI_3VDL_2Op<0b0, 0b1100, "smull", int_arm_neon_vmulls, 1>;
> +defm UMULLvvv :  NeonI_3VDL_2Op<0b1, 0b1100, "umull", int_arm_neon_vmullu, 1>;
>
> Aren't these even simpler than addhn and friends? An extend followed
> by a multiply? They're also always commutable, so commutability
> probably doesn't need to be a template parameter (same for sabdl and uabdl).
>
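
The "extend, then multiply" reading can be sketched per lane as follows.
This is a Python model rather than IR or TableGen. The names smull_lane
and umull_lane are illustrative, and lanes are passed as unsigned bit
patterns of the narrow width (32 bits by default).

```python
def sext(value, bits):
    """Reinterpret an unsigned lane of the given width as a signed integer."""
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def smull_lane(lhs, rhs, bits=32):
    """Sign-extend both narrow lanes, then multiply at double width."""
    product = sext(lhs, bits) * sext(rhs, bits)
    return product & ((1 << (2 * bits)) - 1)


def umull_lane(lhs, rhs, bits=32):
    """Zero-extend both narrow lanes, then multiply at double width."""
    return (lhs * rhs) & ((1 << (2 * bits)) - 1)


minus_one = 0xFFFFFFFF
print(hex(smull_lane(minus_one, 2)))
print(hex(umull_lane(minus_one, 2)))
```

Both operands go through the same extension here, so swapping them cannot
change the result.
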
> +defm SQDMLALvvv : NeonI_3VDL_3Op_v2<0b0, 0b1001, "sqdmlal",
> +                                    int_arm_neon_vqdmlal>;
>
> The qdmlals are just qdmulls with an extra addition, I think.
>
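
A per-lane sketch of that decomposition, in Python. It assumes the
architectural behaviour as I understand it: the doubled product saturates
to the wide lane width, and the accumulate step saturates again. The names
sat, sqdmull_lane and sqdmlal_lane are illustrative, and lanes are passed
as already-signed Python integers (16-bit sources, 32-bit accumulator by
default).

```python
def sat(value, bits):
    """Clamp a signed integer to the range of a signed lane of that width."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return max(lo, min(hi, value))


def sqdmull_lane(a, b, bits=16):
    """Doubling multiply of two narrow lanes, saturated at double width."""
    return sat(2 * a * b, 2 * bits)


def sqdmlal_lane(acc, a, b, bits=16):
    """sqdmull followed by a saturating add into the wide accumulator."""
    return sat(acc + sqdmull_lane(a, b, bits), 2 * bits)


print(sqdmlal_lane(10, 3, 4))
print(sqdmlal_lane(1, -32768, -32768))  # both steps hit the upper bound
```
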
> Cheers.
>
> Tim.
>
