[PATCH] D41792: [AArch64] Add ARMv8.2-A FP16 scalar intrinsics
Abderrazek Zaafrani via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Thu Jan 11 14:26:11 PST 2018
az marked 6 inline comments as done.
az added inline comments.
================
Comment at: clang/include/clang/Basic/arm_fp16.td:58
+class IInst<string n, string p, string t> : Inst<n, p, t, OP_NONE> {}
+
+// ARMv8.2-A FP16 intrinsics.
----------------
SjoerdMeijer wrote:
> az wrote:
> > SjoerdMeijer wrote:
> > > There's a little bit of duplication here: the definitions above are the same as in arm_neon.td. Would it be easy to share this, with e.g. an include?
> > The duplication is small compared to the overall infrastructure/data structure needed to automatically generate the intrinsics. There are 3 ways to do this: 1) copy only the needed data structure in arm_fp16.td (this is what was done in original review) 2) put all data structure in a newly created file and include it in arm_neon.td and arm_fp16.td (done here). 3) put only the duplication in a new file and include it. I did not go for this one given that we create a new file for the only purpose of avoiding a small duplication but I am fine of going with 3 too. Note that some of the duplicated structure in the original arm_fp16.td was a stripped down version of the copied one.
> Given that the duplication is tiny, I don't have strong opinions to be honest. Would be nice to share these definitions if that's easy to do, otherwise we can perfectly live with this I think.
So, let's keep the current version for now which is: all generic stuff goes into the file called arm_neon_incl.td. All specific code that creates and generates the intrinsic goes into specific files arm_neon.td and arm_fp16.td which include the generic file. This will work well when we create new .td file for future features if any.
================
Comment at: llvm/include/llvm/IR/IntrinsicsAArch64.td:250
def int_aarch64_neon_umax : AdvSIMD_2VectorArg_Intrinsic;
- def int_aarch64_neon_fmax : AdvSIMD_2VectorArg_Intrinsic;
+ def int_aarch64_neon_fmax : AdvSIMD_2FloatArg_Intrinsic;
def int_aarch64_neon_fmaxnmp : AdvSIMD_2VectorArg_Intrinsic;
----------------
SjoerdMeijer wrote:
> There's a scalar and vector variant of FMAX and thus I am wondering if we don't need two definitions here: one using AdvSIMD_2FloatArg_Intrinsic and the other AdvSIMD_2VectorArg_Intrinsic?
Maybe we can do that but there are many instances where vector and scalar share the same intrinsic name ( note that the type such as f16 or v4f32 is appended to that name). I have not checked carefully if what you propose already exists or not but it does seem less common from first look.
https://reviews.llvm.org/D41792
More information about the cfe-commits
mailing list