[PATCH] D60828: [ARM] Fix armv8 features tree and add fp16fml

Thu Apr 18 01:59:50 PDT 2019

dnsampaio marked 5 inline comments as done.
dnsampaio added inline comments.

================
Comment at: lib/Basic/Targets/ARM.cpp:443
       HasLegalHalfType = true;
+      HW_FP |= HW_FP_SP | HW_FP_DP | HW_FP_HP;
+      FPU |= VFP4FPU;
----------------
ostannard wrote:
> Is it always correct to set HW_FP_DP here, now that MVE can have full fp16 without double-precision? I'll add Simon since he's working on that.
If it is V8 and does not have double-precision, isn't that going to use the argument `+fp-only-sp` ? That will disable WH_FP_DP  using lines 436 and 456.
```
else if (Feature == "+fp-only-sp") {
HW_FP_remove |= HW_FP_DP;
```
```
HW_FP &= ~HW_FP_remove;
```

================
Comment at: lib/Basic/Targets/ARM.cpp:444
+      HW_FP |= HW_FP_SP | HW_FP_DP | HW_FP_HP;
+      FPU |= VFP4FPU;
     } else if (Feature == "+dotprod") {
----------------
ostannard wrote:
> Should this be FPARMV8, since fullfp16 doesn't apply to earlier architectures? Maybe MVE  complicates this even further?
Correct, it should also add FeatureFPARMv8.
See that this is a accumulative flag, as FPARMv8 implies VFP4FPU, both should be set.
 Actually, according the backend, `FeatureFPARMv8 -> FeatureVFP4 -> FeatureVFP3 -> FeatureVFP2`. From my tests we already set FeatureVFP3, but not FeatureVFP4

================
Comment at: lib/Basic/Targets/ARM.cpp:446
     } else if (Feature == "+dotprod") {
+      FPU |= NeonFPU;
+      HW_FP |= HW_FP_SP | HW_FP_DP;
----------------
ostannard wrote:
> Should this also add FPARMV8?
As well, yes.

================
Comment at: lib/Basic/Targets/ARM.cpp:452
+      HasLegalHalfType = true;
+      FPU |= VFP4FPU;
+      HW_FP |= HW_FP_SP | HW_FP_DP | HW_FP_HP;
----------------
ostannard wrote:
> Again, should this be FPARMV8?
True.

================
Comment at: test/CodeGen/arm_neon_intrinsics.c:8
+// RUN: %clang -O1 -target armv8a-linux-eabi -march=armv8a+fp16fml\
+// RUN:  -S -emit-llvm -o - %s | FileCheck %s.v8
+
----------------
ostannard wrote:
> Does the generate code differ enough to have a second copy of it? Actually, I assume the problem here is that we are not setting the correct preprocessor macros? in which case, it would be better to test them directly, than by re-running this 21kloc test.
That indeed seems a better solution.

Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D60828/new/

https://reviews.llvm.org/D60828