[PATCH] D43414: AMDGPU: Define FP_FAST_FMA{F} macros for amdgcn
Tony Tye via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Feb 16 15:24:57 PST 2018
t-tye added inline comments.
================
Comment at: lib/Basic/Targets/AMDGPU.cpp:345-348
+  if (getTriple().getArch() == llvm::Triple::amdgcn) {
+    Builder.defineMacro("FP_FAST_FMA");
+    Builder.defineMacro("FP_FAST_FMAF");
+  }
----------------
b-sumner wrote:
> t-tye wrote:
> > Do all amdgcn targets have fast FMA? @b-sumner can you clarify?
> No. All targets that support double precision should report FAST_FMA. Only targets with full-rate v_fma_f32 should report FAST_FMAF.
>
It is unfortunate that clang does not have access to the processor features defined in the .td files, which give the settings for each target.
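
For illustration only, the rule b-sumner describes could be sketched as below. This is a standalone sketch, not clang code: the GpuFeatures struct, its two flags, and fastFmaMacros are hypothetical names, and how clang would actually learn these per-processor properties is exactly the open question here.

```cpp
#include <string>
#include <vector>

// Hypothetical feature flags for one processor. These names are
// illustrative only; they are not clang or LLVM identifiers.
struct GpuFeatures {
  bool HasFP64;          // target supports double precision
  bool HasFullRateFMA32; // v_fma_f32 executes at full rate
};

// Returns the macro names to define under the rule quoted above:
// FP_FAST_FMA for targets with double precision, and FP_FAST_FMAF
// only for targets with full-rate v_fma_f32.
std::vector<std::string> fastFmaMacros(const GpuFeatures &F) {
  std::vector<std::string> Macros;
  if (F.HasFP64)
    Macros.push_back("FP_FAST_FMA");
  if (F.HasFullRateFMA32)
    Macros.push_back("FP_FAST_FMAF");
  return Macros;
}
```

In the patch, each returned name would correspond to a Builder.defineMacro call, guarded per processor rather than by the amdgcn arch check alone.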
https://reviews.llvm.org/D43414