[PATCH] D14909: [X86][FMA] Optimize FNEG(FMUL) Patterns
Simon Pilgrim via llvm-commits
llvm-commits at lists.llvm.org
Wed Nov 25 01:42:19 PST 2015
RKSimon added a comment.
Elena please can you confirm if we need hasAVX512() as well as hasFMA() ?
================
Comment at: lib/Target/X86/X86ISelLowering.cpp:26161
@@ +26160,3 @@
+ // use of a constant by performing (0 - A*B) instead.
+ if (Arg.getOpcode() == ISD::FMUL && DAG.getTarget().Options.UnsafeFPMath &&
+ (SVT == MVT::f32 || SVT == MVT::f64) &&
----------------
spatel wrote:
> Do we need unsafe math for this transform?
IMO no its not necessary, I was just being conservative - subtract from zero at type or internal precision should give the same result.
================
Comment at: lib/Target/X86/X86ISelLowering.cpp:26163
@@ +26162,3 @@
+ (SVT == MVT::f32 || SVT == MVT::f64) &&
+ (Subtarget->hasFMA() || Subtarget->hasFMA4() || Subtarget->hasAVX512())) {
+ SDLoc DL(N);
----------------
spatel wrote:
> Aren't the FMA checks enough? Is there some target that assumes that AVX512 includes FMA but does not set the FMA attribute?
This is what we're doing in all similar cases - I don't have a CPUID spec that discusses whether FMA is guaranteed for AVX512.
================
Comment at: lib/Target/X86/X86ISelLowering.cpp:26164
@@ +26163,3 @@
+ (Subtarget->hasFMA() || Subtarget->hasFMA4() || Subtarget->hasAVX512())) {
+ SDLoc DL(N);
+ SDValue Zero = DAG.getConstantFP(0.0, DL, VT);
----------------
spatel wrote:
> Nit: could hoist this and reuse below.
Easily done - we seem to be very inconsistent as to whether we hoist SDLoc (and sometimes not use it) or only generate them if we actually perform the combine.
Repository:
rL LLVM
http://reviews.llvm.org/D14909
More information about the llvm-commits
mailing list