[PATCH] [AArch64] Handle vec4 sitofp and uitofp for half

Wed Apr 22 14:17:51 PDT 2015

LGTM as well (with a couple of nits), thanks!


================
Comment at: lib/Target/AArch64/AArch64ISelLowering.cpp:554-563
@@ -553,8 +553,12 @@
 
     // AArch64 doesn't have a direct vector ->f32 conversion instructions for
     // elements smaller than i32, so promote the input to i32 first.
     setOperationAction(ISD::UINT_TO_FP, MVT::v4i8, Promote);
     setOperationAction(ISD::SINT_TO_FP, MVT::v4i8, Promote);
     setOperationAction(ISD::UINT_TO_FP, MVT::v4i16, Promote);
     setOperationAction(ISD::SINT_TO_FP, MVT::v4i16, Promote);
+    setOperationAction(ISD::SINT_TO_FP, MVT::v8i8, Promote);
+    setOperationAction(ISD::UINT_TO_FP, MVT::v8i8, Promote);
+    setOperationAction(ISD::SINT_TO_FP, MVT::v8i16, Promote);
+    setOperationAction(ISD::UINT_TO_FP, MVT::v8i16, Promote);
     // Similarly, there is no direct i32 -> f64 vector conversion instruction.
----------------
As you did below, consider adding a separate comment for these, to make it clear that v8iN inputs can occur when converting to v8f16, just as v4iN inputs can when converting to v4f32.

================
Comment at: test/CodeGen/AArch64/fp16-v16-instructions.ll:24-31
@@ +23,10 @@
+; CHECK-LABEL: sitofp_i64:
+; CHECK-DAG: scvtf [[D0:v[0-9]+\.2d]], v0.2d
+; CHECK-DAG: scvtf [[D1:v[0-9]+\.2d]], v1.2d
+; CHECK-DAG: scvtf [[D2:v[0-9]+\.2d]], v2.2d
+; CHECK-DAG: scvtf [[D3:v[0-9]+\.2d]], v3.2d
+; CHECK-DAG: scvtf [[D4:v[0-9]+\.2d]], v4.2d
+; CHECK-DAG: scvtf [[D5:v[0-9]+\.2d]], v5.2d
+; CHECK-DAG: scvtf [[D6:v[0-9]+\.2d]], v6.2d
+; CHECK-DAG: scvtf [[D7:v[0-9]+\.2d]], v7.2d
+
----------------
And this kind of thing is why I abhor the ARM-style NEON syntax, and much prefer Apple's.

That said, conversions aren't a very good example, and you could match just the register number rather than the whole operand, as you do for S0 below. Anyway, enough ranting; I'm fine with this ;)
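
For the register-number form, something along these lines should work. This is only a sketch: the D0/D1 names are reused from the checks above, and the last line is a hypothetical later use, not taken from the patch.

```llvm
; Capture only the register number; the ".2d" arrangement stays outside
; the variable, so later checks can reuse the number with another suffix.
; CHECK-DAG: scvtf v[[D0:[0-9]+]].2d, v0.2d
; CHECK-DAG: scvtf v[[D1:[0-9]+]].2d, v1.2d
; Hypothetical later use (the real follow-up instruction may differ):
; CHECK-DAG: fcvtn v{{[0-9]+}}.2s, v[[D0]].2d
```

Outside the `[[...]]` and `{{...}}` blocks FileCheck matches text literally, so the `.` in `.2d` no longer needs escaping.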

================
Comment at: test/CodeGen/AArch64/fp16-v8-instructions.ll:340-341
@@ +339,4 @@
+; CHECK-LABEL: uitofp_i32:
+; CHECK: ucvtf [[OP1:v[0-9]+\.4s]], v0.4s
+; CHECK: fcvtn v0.4h, [[OP1]]
+  %1 = uitofp <8 x i32> %a to <8 x half>
----------------
These checks only cover the conversion of v0.4s. What about the higher lanes? (Another rant: this is why I like over-using CHECK-NEXT; it avoids this kind of question entirely.)
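
To illustrate what I mean by CHECK-NEXT, here is a rough sketch. The instruction sequence is an assumption on my part (I have not run llc on this), so the second ucvtf, the fcvtn2 and the final ret are placeholders for whatever the backend actually emits.

```llvm
; NOTE: hypothetical sequence, not verified against llc output.
; CHECK-LABEL: uitofp_i32:
; CHECK:      ucvtf [[OP1:v[0-9]+\.4s]], v0.4s
; CHECK-NEXT: ucvtf [[OP2:v[0-9]+\.4s]], v1.4s
; CHECK-NEXT: fcvtn v0.4h, [[OP1]]
; CHECK-NEXT: fcvtn2 v0.8h, [[OP2]]
; CHECK-NEXT: ret
```

Each CHECK-NEXT has to match on the line immediately after the previous match, so the test fails if the handling of the upper half is missing or if unexpected instructions show up in between.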

http://reviews.llvm.org/D9166
