[PATCH] [AArch64] Add v8.1a "Rounding Doubling Multiply Add/Subtract" extension

Thu Mar 26 12:45:04 PDT 2015

REPOSITORY
  rL LLVM

================
Comment at: lib/Target/AArch64/AArch64InstrFormats.td:8555
@@ -8523,1 +8554,3 @@
 //----------------------------------------------------------------------------
+// AdvSIMD v8.1 Rounding Double Multiply Add/Subtract
+//----------------------------------------------------------------------------
----------------
jmolloy wrote:
> I think this should say "Doubling", right?
Sorry, will be changed in next revision.

================
Comment at: lib/Target/AArch64/AArch64InstrFormats.td:8607
@@ +8606,3 @@
+  }
+  // FIXME: uncomment the following, after backend will support i16 neon type
+  //def : Pat<(i16 (Accum (i16 FPR16Op:$Rd),
----------------
jmolloy wrote:
> We don't allow commented out code, sorry. You should also be able to enable this pattern, by using an SMOV instruction (you'd need a UMOV for the equivalent unsigned operation):
> 
>     def : Pat<(... stays the same ...),
>                   (SMOVvi16to32 (!cast<Instruction>(NAME # v4i16_indexed)
>                     ..., VectorIndexH:0)>;
> 
> But the commented out pattern seems weird; it uses SUBREG_TO_REG, which I don't understand why, and it matches sqdmull, not sqrdmulh like the surrounding code. So something's fishy here.
Oops, good catch.
That's the correct pattern that is supposed to be here, but cannot be compiled due to problem with first part(matching), not with second part(generating)
```
  // FIXME: this cannot be processed by TableGen
  // error: In SQRDMLAHanonymous_913: Type inference contradiction found, 
  // merging '{i32:i64:v8i8:v16i8:v4i16:v8i16:v2i32:v4i32:v1i64:v2i64}' into 'i16'
  // error: In SQRDMLAHanonymous_913: Type inference contradiction found, merging 'i16' into 'f16'
  //def : Pat<(i16 (Accum (i16 FPR16Op:$Rd),
  //                      (i16 (vector_extract
  //                             (v8i16 (insert_subvector
  //                                      (undef),
  //                                      (v4i16 (int_aarch64_neon_sqrdmulh 
  //                                               (v4i16 V64:$Rn),
  //                                               (v4i16 (AArch64duplane16 
  //                                                        (v8i16 V128_lo:$Rm),
  //                                                        VectorIndexH:$idx)))),
  //                                    (i32 0))),
  //                             (i64 0))))),
  //          (EXTRACT_SUBREG
  //              (v4i16 (!cast<Instruction>(NAME # v4i16_indexed)
  //                        (v4i16 (INSERT_SUBREG (v4i16 (IMPLICIT_DEF)),
  //                                              FPR16Op:$Rd,
  //                                              ssub)),
  //                        V64:$Rn,
  //                        V128_lo:$Rm, 
  //                        VectorIndexH:$idx)),
  //              ssub)>;
```
Test for it: test_sqrdmlsh_extracted_lane_s16 (see below)

================
Comment at: lib/Target/AArch64/AArch64InstrFormats.td:8636
@@ +8635,3 @@
+  // after backend will support i16 neon type
+
+  def v2i32_indexed : BaseSIMDIndexedTied<0, U, 0, 0b10, opc,
----------------
Also, that's a non-compilable pattern, supposed to be here and tested by test_sqrdmlahq_extracted_lane_s16
```
  // FIXME: this cannot be processed by TableGen
  // error: In SQRDMLAHanonymous_913: Type inference contradiction found, 
  // merging '{i32:i64:v8i8:v16i8:v4i16:v8i16:v2i32:v4i32:v1i64:v2i64}' into 'i16'
  // error: In SQRDMLAHanonymous_913: Type inference contradiction found, merging 'i16' into 'f16'
  //def : Pat<(i16 (Accum (i16 FPR16Op:$Rd),
  //                      (i16 (vector_extract 
  //                             (v8i16 (int_aarch64_neon_sqrdmulh 
  //                                      (v8i16 V128:$Rn),
  //                                      (v8i16 (AArch64duplane16 
  //                                               (v8i16 V128_lo:$Rm),
  //                                               VectorIndexH:$idx)))),
  //                             (i64 0))))),
  //          (EXTRACT_SUBREG
  //              (v8i16 (!cast<Instruction>(NAME # v8i16_indexed)
  //                        (v8i16 (INSERT_SUBREG (v8i16 (IMPLICIT_DEF)),
  //                                              FPR16Op:$Rd, 
  //                                              ssub)), 
  //                        V128:$Rn,
  //                        V128_lo:$Rm, 
  //                        VectorIndexH:$idx)),
  //              ssub)>;
```

================
Comment at: lib/Target/AArch64/AArch64InstrFormats.td:8657
@@ +8656,3 @@
+                        (i32 (vector_extract 
+                               (v4i32 (insert_subvector
+                                        (undef), 
----------------
jmolloy wrote:
> This is really weird and sounds like a bug, although if the pattern matches I can't really argue with it as it means the bug is somewhere else...
> 
> ... I assume this pattern is explicitly tested?
weird extra node (v4i32 (insert_subvector (undef),(2i32...   is inserted to this DAG, because extact_subvector is illegal from 2i32. It is legal only from 4i32. 
That could be a bug of higher design level, do you have any thoughts?

Meanwhile this pattern successully matches DAG, that we have for explicit test  "test_sqrdmlah_extracted_lane_s32" (see comment below)


================
Comment at: lib/Target/AArch64/AArch64InstrFormats.td:8714
@@ +8713,3 @@
+    [
+  // FIXME: uncomment the following, after backend will support i16 neon type
+  //  (set (i16 FPR16Op:$dst),
----------------
jmolloy wrote:
> You should be able to implement this with SMOV/UMOV, as I mentioned above.
As I commented above, it's a problem of another kind: namely, Tablegen cannot generate matcher for snippet 
```
Accum (i16 FPR16Op:$Rd), (i16 (int_aarch64_neon_sqrdmulh....

```
because of error
```
SQRDMLAHi16_indexed: 	(set FPR16Op:i16:$dst, (intrinsic_wo_chain:{i32:i64:v8i8:v16i8:v4i16:v8i16:v2i32:v4i32:v1i64:v2i64} 117:iPTR, FPR16Op:<empty>:$Rd, (intrinsic_wo_chain:i16 122:<empty>, FPR16Op:i16:$Rn, (vector_extract:i16 V128_lo:v8i16:$Rm, (imm:i64)<<P:Predicate_VectorIndexH>>:$idx))))
Included from /work/llvm-rw/lib/Target/AArch64/AArch64.td:58:
/work/llvm-rw/lib/Target/AArch64/AArch64InstrInfo.td:4357:1: error: In SQRDMLAHi16_indexed: Type inference contradiction found, merging 'f16' into 'i16'
defm SQRDMLAH : SIMDIndexedSQRDMLxHSDTied<1, 0b1101, "sqrdmlah",
^
Included from /work/llvm-rw/lib/Target/AArch64/AArch64.td:58:
Included from /work/llvm-rw/lib/Target/AArch64/AArch64InstrInfo.td:283:
/work/llvm-rw/lib/Target/AArch64/AArch64InstrFormats.td:8737:3: note: instantiated from multiclass
  def i16_indexed : BaseSIMDIndexedTied<1, U, 1, 0b01, opc,
  ^
```

================
Comment at: lib/Target/AArch64/AArch64InstrInfo.td:3086
@@ +3085,3 @@
+  // FIXME: uncomment the following, after backend will support i16 neon type
+  //def : Pat<(i16 (int_aarch64_neon_sqadd
+  //                 (i16 FPR16:$Rd),
----------------
jmolloy wrote:
> Again, UMOV/SMOV to implement this.
(the same as discussion above)

================
Comment at: test/CodeGen/AArch64/arm64-neon-v8.1a.ll:230
@@ +229,3 @@
+;define i16 @test_sqrdmlahq_extracted_lane_s16(i16 %acc,<8 x i16> %x, <8 x i16> %v) {
+;; cHECK-LABEL: test_sqrdmlahq_extracted_lane_s16:
+;entry:
----------------
test for second non-compilable pattern, marked with // FIXME: this cannot be processed by TableGen

================
Comment at: test/CodeGen/AArch64/arm64-neon-v8.1a.ll:241
@@ +240,3 @@
+define i32 @test_sqrdmlah_extracted_lane_s32(i32 %acc,<2 x i32> %x, <2 x i32> %v) {
+; CHECK-LABEL: test_sqrdmlah_extracted_lane_s32:
+entry:
----------------
That's a test for really weird pattern above to match weird extra insert_subvector in DAG

================
Comment at: test/CodeGen/AArch64/arm64-neon-v8.1a.ll:269
@@ +268,3 @@
+;define i16 @test_sqrdmlsh_extracted_lane_s16(i16 %acc,<4 x i16> %x, <4 x i16> %v) {
+;; cHECK-LABEL: test_sqrdmlsh_extracted_lane_s16:
+;entry:
----------------
test for first non-compilable pattern, marked with // FIXME: this cannot be processed by TableGen

http://reviews.llvm.org/D8502

EMAIL PREFERENCES
  http://reviews.llvm.org/settings/panel/emailpreferences/