[PATCH] D41792: [AArch64] Add ARMv8.2-A FP16 scalar intrinsics

Wed Jan 10 16:54:43 PST 2018

az marked 8 inline comments as done.
az added inline comments.

================
Comment at: clang/include/clang/Basic/arm_fp16.td:58
+class IInst<string n, string p, string t> : Inst<n, p, t, OP_NONE> {}
+
+// ARMv8.2-A FP16 intrinsics.
----------------
SjoerdMeijer wrote:
> There's a little bit of duplication here: the definitions above are the same as in arm_neon.td. Would it be easy to share this, with e.g. an include?
The duplication is small compared to the overall infrastructure/data structure needed to automatically generate the intrinsics. There are 3 ways to do this: 1) copy only the needed data structure in arm_fp16.td (this is what was done in original review) 2) put all data structure in a newly created file and include it in arm_neon.td and arm_fp16.td (done here). 3) put only the duplication in a new file and include it. I did not go for this one given that we create a new file for the only purpose of avoiding a small duplication but I am fine of going with 3 too. Note that some of the duplicated structure in the original arm_fp16.td was a stripped down version of the copied one.

================
Comment at: clang/lib/CodeGen/CGBuiltin.cpp:4102
   NEONMAP1(vuqadds_s32, aarch64_neon_suqadd, Add1ArgType),
+  // FP16 scalar intrinisics go here.
+  NEONMAP1(vabdh_f16, aarch64_sisd_fabd, Add1ArgType),
----------------
SjoerdMeijer wrote:
> Looks like  a few intrinsic descriptions are missing here. For example, the first 2-operand intrinsic vaddh_f16 is missing, but there are also more. Is this intentional, or might they have slipped through the cracks (or am I missing something)?
I agree that this is confusing. For the intrinsics listed in this table, code generation happens in a generic way based on the info in the table. The ones not listed in this table are addressed in a more specific way below in a the function called EmitAArch64BuiltinExpr. While I do not like how few things were implemented in generating the intrinsics, I am in general following the approach taken for arm_neon instead of introducing a new approach. 

================
Comment at: llvm/include/llvm/IR/IntrinsicsAArch64.td:149
                 [IntrNoMem]>;
+
+  class AdvSIMD_1Arg_Intrinsic
----------------
SjoerdMeijer wrote:
> This and the other changes in this file are changes to LLVM. Do we need these changes for this patch? It doesn't look like it.
Some tests in aarch64-v8.2a-fp16-intrinsics.c will fail for me without these changes. In clang/lib/CodeGen/BackendUtil.cpp, there is code there that includes llvm files and header files. It fails there if I do not fix IntrinsicAArch64.td. If you know of a better way to test differently without the need for llvm, then let me know. For example, if I remove the option flag -S from testing (i.e from aarch64-v8.2a-fp16-intrinsics.c), then there is no need to llvm but I won't be able to compare results.  

https://reviews.llvm.org/D41792