<html>

<head>

<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">

<style type="text/css" style="display:none;"> P {margin-top:0;margin-bottom:0;} </style>

</head>

<body dir="ltr">

<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">

Hi,</div>

<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">

Which version of Clang are you using? I do get a "vfma.f16" with a recent trunk build. I haven't checked older versions or when this landed, but we made an effort to plug the remaining fp16 holes not that long ago, so hopefully a newer version will just work for you.</div>
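<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">

If upgrading is not an option, the explicit intrinsic mentioned in the original message is one possible workaround. The following is only a minimal sketch, not something taken from this thread: the names <code>fma4</code> and <code>half_t</code> are hypothetical, and the plain-float fallback is an assumption added so that the same file also builds on targets without fp16 vector arithmetic.</div>

```c
#include <stddef.h>

#if defined(__ARM_FEATURE_FP16_VECTOR_ARITHMETIC)
#include <arm_neon.h>

/* Hypothetical element type for this sketch: native half precision. */
typedef float16_t half_t;

/* c[0..3] += a[0..3] * b[0..3], requesting the fused operation explicitly. */
void fma4(half_t *c, const half_t *a, const half_t *b) {
  float16x4_t va = vld1_f16(a);
  float16x4_t vb = vld1_f16(b);
  float16x4_t vc = vld1_f16(c);
  vc = vfma_f16(vc, va, vb); /* per lane: vc + va * vb */
  vst1_f16(c, vc);
}
#else
/* Assumption: without fp16 vector arithmetic, fall back to plain float
   so the sketch still compiles and runs. */
typedef float half_t;

void fma4(half_t *c, const half_t *a, const half_t *b) {
  for (size_t i = 0; i < 4; ++i)
    c[i] += a[i] * b[i];
}
#endif
```

<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">

Calling <code>fma4(c, a, b)</code> with a = {1, 2, 3, 4}, b = {2, 2, 2, 2} and c = {1, 1, 1, 1} leaves c = {3, 5, 7, 9} on either path. The intrinsic path should only be selected when the target enables fp16 vector arithmetic, for example with the -march value used in the original message.</div>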

<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">

<br>

</div>

<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">

Cheers,</div>

<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">

Sjoerd.</div>

<div id="appendonsend"></div>

<hr style="display:inline-block;width:98%" tabindex="-1">

<div id="divRplyFwdMsg" dir="ltr"><font face="Calibri, sans-serif" style="font-size:11pt" color="#000000"><b>From:</b> llvm-dev &lt;llvm-dev-bounces@lists.llvm.org&gt; on behalf of Yizhi Liu via llvm-dev &lt;llvm-dev@lists.llvm.org&gt;<br>

<b>Sent:</b> 05 September 2019 06:52<br>

<b>To:</b> llvm-dev@lists.llvm.org &lt;llvm-dev@lists.llvm.org&gt;<br>

<b>Subject:</b> [llvm-dev] ARM vectorized fp16 support</font>

<div> </div>

</div>

<div class="BodyFragment"><font size="2"><span style="font-size:11pt;">

<div class="PlainText">Hi,<br>

<br>

I'm trying to compile a half-precision program for ARM, but it seems<br>

LLVM fails to automatically generate fused multiply-add instructions<br>

for c += a * b. Am I doing something wrong? If not, is this a missing<br>

feature that will be supported later? (I know there are fp16 FMLA<br>

intrinsics, though.)<br>

<br>

Test programs and outputs:<br>

<br>

$ clang -O3 -march=armv8.2-a+fp16fml -ffast-math -S -o- vfp32.c<br>

test_vfma_lane_f16:                     // @test_vfma_lane_f16<br>

                fmla       v2.4s, v1.4s, v0.4s   // fp32 is GOOD<br>

                mov       v0.16b, v2.16b<br>

                ret<br>

$ cat vfp32.c<br>

#include &lt;arm_neon.h&gt;<br>

float32x4_t test_vfma_lane_f16(float32x4_t a, float32x4_t b, float32x4_t c) {<br>

  c += a * b;<br>

  return c;<br>

}<br>

<br>

$ clang -O3 -march=armv8.2-a+fp16fml -ffast-math -S -o- vfp16.c<br>

test_vfma_lane_f16:                     // @test_vfma_lane_f16<br>

                fmul       v0.4h, v1.4h, v0.4h<br>

                fadd       v0.4h, v0.4h, v2.4h  // fp16 does NOT use FMLA<br>

                ret<br>

$ cat vfp16.c<br>

#include &lt;arm_neon.h&gt;<br>

float16x4_t test_vfma_lane_f16(float16x4_t a, float16x4_t b, float16x4_t c) {<br>

  c += a * b;<br>

  return c;<br>

}<br>

<br>

-- <br>

Yizhi Liu<br>


</div>

</span></font></div>

</body>

</html>