[PATCH] [AArch64] Add v8.1a RDMA extension

Vladimir Sukharev vladimir.sukharev at arm.com
Wed Mar 4 13:48:54 PST 2015

Hi Tim, 
after some research I've settled on using v1i8, v1i16 and v1i32 as valid contents of FPR8, FPR16 and FPR32.
Only this way could I fuse the sqadd and sqrdmulh intrinsics, with v1i16 as the operand, intermediate and result type.
In fact, even the current NEON scalar instructions sqadd and sqrdmulh should use v1i16 types. Instead, the f16 type, held in FPR16, is used. That results in tricky/hacky DAG matching, like this:

  f16 SQRDMULH(f16,f16,f16);
  v1i16 op1;
  v1i16 op2;
  v1i16 result = cast(v1i16, SQRDMULH(cast(f16, op1), cast(f16, op2)));

For a single instruction it works, but I couldn't find a way to fuse two instructions without using an explicit v1i16 type.
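
For reference, here is a minimal Python sketch of the scalar 16-bit semantics of the sequence being fused, sqrdmulh followed by sqadd, as I understand them from the instruction descriptions. The helper names (sat16, sqrdmulh16, sqadd16, sqadd_of_sqrdmulh16) are made up for illustration and are not LLVM or ACLE names. It models only the unfused pair, not the RDMA instruction itself.

```python
# Illustrative model only: scalar 16-bit semantics of the unfused sequence.
# All names here are hypothetical helpers, not part of LLVM.
INT16_MIN, INT16_MAX = -(1 << 15), (1 << 15) - 1


def sat16(x):
    """Clamp an integer to the signed 16-bit range."""
    return max(INT16_MIN, min(INT16_MAX, x))


def sqrdmulh16(a, b):
    """Signed saturating rounding doubling multiply, returning the high half."""
    return sat16((2 * a * b + (1 << 15)) >> 16)


def sqadd16(a, b):
    """Signed saturating add."""
    return sat16(a + b)


def sqadd_of_sqrdmulh16(acc, a, b):
    """The two-instruction sequence that the DAG pattern has to match."""
    return sqadd16(acc, sqrdmulh16(a, b))
```

The intermediate value returned by sqrdmulh16 is the one that, in the DAG, needs a type shared by both nodes; that is where v1i16 comes in.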

Locally I now have the following:

  diff --git a/lib/Target/AArch64/AArch64RegisterInfo.td b/lib/Target/AArch64/AArch64RegisterInfo.td
  index d5ff3f1..628e9c7 100644
  --- a/lib/Target/AArch64/AArch64RegisterInfo.td
  +++ b/lib/Target/AArch64/AArch64RegisterInfo.td
  @@ -382,13 +382,13 @@ def Q30   : AArch64Reg<30, "q30", [D30], ["v30", ""]>, DwarfRegAlias<B30>;
   def Q31   : AArch64Reg<31, "q31", [D31], ["v31", ""]>, DwarfRegAlias<B31>;
  -def FPR8  : RegisterClass<"AArch64", [untyped], 8, (sequence "B%u", 0, 31)> {
  +def FPR8  : RegisterClass<"AArch64", [untyped, v1i8], 8, (sequence "B%u", 0, 31)> {
     let Size = 8;
  -def FPR16 : RegisterClass<"AArch64", [f16], 16, (sequence "H%u", 0, 31)> {
  +def FPR16 : RegisterClass<"AArch64", [f16, v1i16], 16, (sequence "H%u", 0, 31)> {
     let Size = 16;
  -def FPR32 : RegisterClass<"AArch64", [f32, i32], 32,(sequence "S%u", 0, 31)>;
  +def FPR32 : RegisterClass<"AArch64", [f32, i32, v1i32], 32,(sequence "S%u", 0, 31)>;
   def FPR64 : RegisterClass<"AArch64", [f64, i64, v2f32, v1f64, v8i8, v4i16, v2i32,
                                       v1i64, v4f16],
                                       64, (sequence "D%u", 0, 31)>;

plus corresponding changes, cleanup and explicit type qualifications in AArch64InstrFormats.td and AArch64InstrInfo.td. I will submit this refactoring shortly.

I'm aware of the recent doubts about whether to accept these types as fully valid ( http://permalink.gmane.org/gmane.comp.compilers.llvm.cvs/175395 ). I guess the time has come for a correct implementation.

I would appreciate any objections, suggestions or alternative approaches to the fusion.
Thanks, Vladimir


