[PATCH] [AArch64] Add v8.1a RDMA extension

Tim Northover t.p.northover at gmail.com
Wed Mar 4 13:45:59 PST 2015


On 3 March 2015 at 13:00, Vladimir Sukharev <vladimir.sukharev at arm.com> wrote:
> Hi Tim,
> thank you for your warm feedback.
>
> 1. **I don't think these new intrinsics are needed. The instructions are effectively "(int_aarch64_neon_sqadd $acc, (int_aarch64_neon_sqrdmulh $LHS, $RHS))".**
>
> Ok, but now I have a severe trouble that could be well-known for v1iNN types. If so, would you please give some hint?
> [...]
>   def : Pat<(i16 (int_aarch64_neon_sqadd (i16 FPR16:$Rd),
>                     (i16 (int_aarch64_neon_sqrdmulh (i16 FPR16:$Rn),
>                                                       (i16 FPR16:$Rm))))),

Most i16 intrinsics don't have patterns yet, because i16 isn't a legal
AArch64 type. Clang currently generates code like this (using vector
ops) to get those semantics:

define signext i16 @foo(i16 signext %l, i16 signext %r) #0 {
  %1 = insertelement <4 x i16> undef, i16 %l, i64 0
  %2 = insertelement <4 x i16> undef, i16 %r, i64 0
  %3 = tail call <4 x i16> @llvm.aarch64.neon.sqadd.v4i16(<4 x i16> %1, <4 x i16> %2) #2
  %4 = extractelement <4 x i16> %3, i64 0
  ret i16 %4
}
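
For anyone who wants to see what that lane-0 trick computes, here is a
rough model in portable C. It is only a sketch: it does not use the real
NEON intrinsics, every name in it (v4i16, model_sqadd_lane,
model_sqrdmulh_lane, foo_model, ...) is made up for illustration, and the
unused lanes are zeroed where the IR leaves them undef. The last function
models the nested sqadd(acc, sqrdmulh(lhs, rhs)) form discussed above; it
makes no claim about whether the hardware instruction is bit-identical in
every corner case.

```c
#include <stdint.h>

/* Hypothetical stand-in for the <4 x i16> IR type. */
typedef struct { int16_t lane[4]; } v4i16;

/* One lane of a signed saturating 16-bit add (models sqadd per element). */
int16_t model_sqadd_lane(int16_t a, int16_t b) {
    int32_t sum = (int32_t)a + (int32_t)b;   /* cannot overflow in 32 bits */
    if (sum > INT16_MAX) return INT16_MAX;
    if (sum < INT16_MIN) return INT16_MIN;
    return (int16_t)sum;
}

/* One lane of a signed saturating rounding doubling multiply that keeps
   the high half (models sqrdmulh per element). */
int16_t model_sqrdmulh_lane(int16_t a, int16_t b) {
    int64_t doubled = 2 * (int64_t)a * (int64_t)b + (1 << 15);
    /* Floor-divide by 2^16 without right-shifting a negative value. */
    int64_t high = doubled / 65536;
    if (doubled % 65536 < 0) high -= 1;
    if (high > INT16_MAX) return INT16_MAX;
    if (high < INT16_MIN) return INT16_MIN;
    return (int16_t)high;
}

/* Lane-wise saturating add over all four lanes. */
v4i16 model_sqadd_v4i16(v4i16 x, v4i16 y) {
    v4i16 r;
    for (int i = 0; i < 4; ++i)
        r.lane[i] = model_sqadd_lane(x.lane[i], y.lane[i]);
    return r;
}

/* Mirrors @foo: insert both scalars into lane 0, do the vector op,
   then extract lane 0 again. */
int16_t foo_model(int16_t l, int16_t r) {
    v4i16 x = {{0}};   /* the IR uses undef here; the model zeroes the lanes */
    v4i16 y = {{0}};
    x.lane[0] = l;
    y.lane[0] = r;
    return model_sqadd_v4i16(x, y).lane[0];
}

/* The nested form from this thread: sqadd(acc, sqrdmulh(lhs, rhs)). */
int16_t model_sqadd_of_sqrdmulh(int16_t acc, int16_t lhs, int16_t rhs) {
    return model_sqadd_lane(acc, model_sqrdmulh_lane(lhs, rhs));
}
```

With this model, foo_model(32767, 1) saturates to 32767 rather than
wrapping, which is the scalar behaviour the vector detour is there to
provide.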

> Note: First I tried another way of implementation, like the following snipped for vector variant. It has failed due to incomplete type inference between two intrinsics.

That looks like an annoying shortcoming in TableGen's type inference:
you often need to be more explicit than you'd hope when specifying the
types of trees involving intrinsics.

Unfortunately, I think you sometimes need to create a new multiclass
hierarchy just to insert the needed types. We should probably fix that
at some point.

Cheers.

Tim.



More information about the llvm-commits mailing list