[PATCH] D43414: AMDGPU: Define FP_FAST_FMA{F} macros for amdgcn

Tony Tye via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Feb 16 15:24:57 PST 2018

t-tye added inline comments.

Comment at: lib/Basic/Targets/AMDGPU.cpp:345-348
+  if (getTriple().getArch() == llvm::Triple::amdgcn) {
+    Builder.defineMacro("FP_FAST_FMA");
+    Builder.defineMacro("FP_FAST_FMAF");
+  }
b-sumner wrote:
> t-tye wrote:
> > Do all amdgcn targets have fast FMA? @b-sumner can you clarify?
> No. All targets that support double precision should report FP_FAST_FMA. Only targets with full-rate v_fma_f32 should report FP_FAST_FMAF.
It is unfortunate that clang does not have access to the processor features defined in the .td files, which give the settings for each target.
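
For illustration, the rule b-sumner describes could be gated per feature rather than per architecture. This is a minimal standalone sketch, not the patch itself: `GpuFeatures`, `MacroSet`, and `defineFastFMAMacros` are hypothetical names, and `MacroSet` is only a stand-in for clang's macro builder. How the two feature bits would actually be obtained inside clang is left open here.

```cpp
#include <map>
#include <string>

// Hypothetical per-processor feature bits. In a real implementation these
// would have to be derived from the target's feature set, which is the
// information the comment above says clang cannot currently reach.
struct GpuFeatures {
  bool HasFP64;           // target supports double precision
  bool HasFullRateFMAF32; // target has full-rate v_fma_f32
};

// Hypothetical stand-in for a macro builder: records name -> value.
struct MacroSet {
  std::map<std::string, std::string> Defs;

  void defineMacro(const std::string &Name, const std::string &Value = "1") {
    Defs[Name] = Value;
  }

  bool isDefined(const std::string &Name) const {
    return Defs.count(Name) != 0;
  }
};

// Define each macro only when the corresponding feature is present,
// instead of defining both for every amdgcn target.
void defineFastFMAMacros(const GpuFeatures &Features, MacroSet &Builder) {
  if (Features.HasFP64)
    Builder.defineMacro("FP_FAST_FMA");
  if (Features.HasFullRateFMAF32)
    Builder.defineMacro("FP_FAST_FMAF");
}
```

With this shape, a target with double precision but without full-rate v_fma_f32 would get FP_FAST_FMA only.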

