[PATCH] D60828: [ARM] Fix armv8 features tree and add fp16fml
Diogo N. Sampaio via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Thu Apr 18 01:59:50 PDT 2019
dnsampaio marked 5 inline comments as done.
dnsampaio added inline comments.
================
Comment at: lib/Basic/Targets/ARM.cpp:443
HasLegalHalfType = true;
+ HW_FP |= HW_FP_SP | HW_FP_DP | HW_FP_HP;
+ FPU |= VFP4FPU;
----------------
ostannard wrote:
> Is it always correct to set HW_FP_DP here, now that MVE can have full fp16 without double-precision? I'll add Simon since he's working on that.
If it is V8 and does not have double-precision, isn't that going to use the argument `+fp-only-sp` ? That will disable WH_FP_DP using lines 436 and 456.
```
else if (Feature == "+fp-only-sp") {
HW_FP_remove |= HW_FP_DP;
```
```
HW_FP &= ~HW_FP_remove;
```
================
Comment at: lib/Basic/Targets/ARM.cpp:444
+ HW_FP |= HW_FP_SP | HW_FP_DP | HW_FP_HP;
+ FPU |= VFP4FPU;
} else if (Feature == "+dotprod") {
----------------
ostannard wrote:
> Should this be FPARMV8, since fullfp16 doesn't apply to earlier architectures? Maybe MVE complicates this even further?
Correct, it should also add FeatureFPARMv8.
See that this is a accumulative flag, as FPARMv8 implies VFP4FPU, both should be set.
Actually, according the backend, `FeatureFPARMv8 -> FeatureVFP4 -> FeatureVFP3 -> FeatureVFP2`. From my tests we already set FeatureVFP3, but not FeatureVFP4
================
Comment at: lib/Basic/Targets/ARM.cpp:446
} else if (Feature == "+dotprod") {
+ FPU |= NeonFPU;
+ HW_FP |= HW_FP_SP | HW_FP_DP;
----------------
ostannard wrote:
> Should this also add FPARMV8?
As well, yes.
================
Comment at: lib/Basic/Targets/ARM.cpp:452
+ HasLegalHalfType = true;
+ FPU |= VFP4FPU;
+ HW_FP |= HW_FP_SP | HW_FP_DP | HW_FP_HP;
----------------
ostannard wrote:
> Again, should this be FPARMV8?
True.
================
Comment at: test/CodeGen/arm_neon_intrinsics.c:8
+// RUN: %clang -O1 -target armv8a-linux-eabi -march=armv8a+fp16fml\
+// RUN: -S -emit-llvm -o - %s | FileCheck %s.v8
+
----------------
ostannard wrote:
> Does the generate code differ enough to have a second copy of it? Actually, I assume the problem here is that we are not setting the correct preprocessor macros? in which case, it would be better to test them directly, than by re-running this 21kloc test.
That indeed seems a better solution.
Repository:
rC Clang
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D60828/new/
https://reviews.llvm.org/D60828
More information about the cfe-commits
mailing list