[PATCH] D53633: [AArch64] Implement FP16FML intrinsics

Ahmed Bougacha via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Fri Feb 15 14:11:50 PST 2019


ab added inline comments.


================
Comment at: cfe/trunk/test/CodeGen/aarch64-neon-fp16fml.c:12
+
+float32x2_t test_vfmlal_low_u32(float32x2_t a, float16x4_t b, float16x4_t c) {
+// CHECK-LABEL: define <2 x float> @test_vfmlal_low_u32(<2 x float> %a, <4 x half> %b, <4 x half> %c)
----------------
SjoerdMeijer wrote:
> SjoerdMeijer wrote:
> > ab wrote:
> > > Hey folks, I'm curious: where does the "_u32" suffix come from? Should it be _f16?
> > > 
> > > Also, are there any new ACLE/intrinsic list documents? As far as I can tell there hasn't been any release since IHI0073B/IHI0053D.
> > 
> > I've checked, and an updated ACLE that includes these FP16FML intrinsics is coming soon.
> > 
> > > where does the "_u32" suffix come from? Should it be _f16?
> > 
> > Good question. It could probably be _f32 or _f16, but _u32 doesn't seem to make much sense. Looks like the spec says _u32, and that's also what GCC has implemented. I think we want to update the spec and fix the name before the updated spec is available. Will chase this, and let you know once I know more.
> An update on this: we should change this to _f32 (because the first suffixes were referring to the output type). The ACLE will be updated accordingly, and also GCC will change its current implementation (from _u32 to _f32). Many thanks for raising this issue.
> Is there a volunteer to prepare a patch? Or do you have one already? :-) I could look at it, but that will be towards the end of next week.
> I've checked, and an updated ACLE that includes these FP16FML intrinsics is coming soon.

Great, thanks!

> An update on this: we should change this to _f32 (because the first suffixes were referring to the output type).

Hmm, I was thinking _f16 based on the vmlal intrinsics: they seem to be named after the multiplication type rather than that of the accumulator/output.

Either way seems fine to me though, I'll defer to you folks.

> The ACLE will be updated accordingly, and also GCC will change its current implementation (from _u32 to _f32). Many thanks for raising this issue.
> Is there a volunteer to prepare a patch? Or do you have one already? :-) I could look at it, but that will be towards the end of next week.

Sure: D58306 (with _f16 though, let me know what you think of vmlal)

Thanks for checking!


Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D53633/new/

https://reviews.llvm.org/D53633




