[PATCH] D28087: X86 instr selection: combine ADDSUB + MUL to FMADDSUB
Elena Demikhovsky via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Sun Jan 1 23:38:51 PST 2017
delena added a comment.
In https://reviews.llvm.org/D28087#633340, @v_klochkov wrote:
> So, if I want to make it possible to generate 512-bit F**M**ADDSUB instrutions, then I'll need to change a little bit the recognition of ADDSUB idioms, i.e. it should be possible to recognize ADDSUB, but never generate 512-bit X86ISD::ADDSUBs.
Does it make sense to convert 512-bit operation ADDSUB to FMADDSUB with all-ones multiplier? What ICC does?
================
Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:32057
+ const TargetOptions &Options = DAG.getTarget().Options;
+ bool AllowFusion =
+ (Options.AllowFPOpFusion == FPOpFusion::Fast || Options.UnsafeFPMath);
----------------
I see more checks in visitFADDForFMACombine. Can we take all these checks to a separate function? Something like isFMALegalAndProfitable().
================
Comment at: llvm/test/CodeGen/X86/fmaddsub-combine.ll:19
+ %Addsub = shufflevector <2 x double> %Sub, <2 x double> %Add, <2 x i32> <i32 0, i32 3>
+ ret <2 x double> %Addsub
+}
----------------
I don't understand the dependency you show in the test.
I suppose that FMADDSUB test should be like this:
%AB = fmul <2 x double> %A, %B
%Sub = fsub <2 x double> %AB, %C
%Add = fadd <2 x double> %AB, %C
%Addsub = shufflevector <2 x double> %Sub, <2 x double> %Add, <2 x i32> <i32 0, i32 3>
Do you check the "fast" attribute of the operation itself? Or you rely on global options only?
https://reviews.llvm.org/D28087
More information about the llvm-commits
mailing list