[PATCH] D28087: X86 instr selection: combine ADDSUB + MUL to FMADDSUB

Elena Demikhovsky via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sun Jan 1 23:38:51 PST 2017


delena added a comment.

In https://reviews.llvm.org/D28087#633340, @v_klochkov wrote:

> So, if I want to make it possible to generate 512-bit F**M**ADDSUB instrutions, then I'll need to change a little bit the recognition of ADDSUB idioms, i.e. it should be possible to recognize ADDSUB, but never generate 512-bit X86ISD::ADDSUBs.


Does it make sense to convert 512-bit operation ADDSUB to FMADDSUB with all-ones multiplier? What ICC does?



================
Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:32057
+  const TargetOptions &Options = DAG.getTarget().Options;
+  bool AllowFusion =
+      (Options.AllowFPOpFusion == FPOpFusion::Fast || Options.UnsafeFPMath);
----------------
I see more checks in visitFADDForFMACombine. Can we take all these checks to a separate function? Something like isFMALegalAndProfitable().


================
Comment at: llvm/test/CodeGen/X86/fmaddsub-combine.ll:19
+  %Addsub = shufflevector <2 x double> %Sub, <2 x double> %Add, <2 x i32> <i32 0, i32 3>
+  ret <2 x double> %Addsub
+}
----------------
I don't understand the dependency you show in the test.
I suppose that FMADDSUB test should be like this:

  %AB = fmul <2 x double> %A, %B
  %Sub = fsub <2 x double> %AB, %C
  %Add = fadd <2 x double> %AB, %C
  %Addsub = shufflevector <2 x double> %Sub, <2 x double> %Add, <2 x i32> <i32 0, i32 3>

Do you check the "fast" attribute of the operation itself? Or you rely on global options only?



https://reviews.llvm.org/D28087





More information about the llvm-commits mailing list