[PATCH] Refactor and enhance FMA combine

Mehdi AMINI mehdi.amini at apple.com
Mon Mar 23 13:48:25 PDT 2015


================
Comment at: lib/CodeGen/SelectionDAG/DAGCombiner.cpp:7060
@@ +7059,3 @@
+      // fold (fadd (fpext (fma x, y, (fmul u, v))), z)
+      //   -> (fma (fpext x), (fpext y), (fma (fpext u), (fpext v), z))
+      if (N0.getOpcode() == ISD::FP_EXTEND) {
----------------
ohsallen wrote:
> joker.eph wrote:
> > This turns two "low precision" and one "high precision" operations into two "high precision" operation. 
> > I'm not saying it is necessarily bad, but I'm not convinced it is always beneficial.
> > The other FMA combines don't have the same behavior.
> In principle you're right, that might not be *always* beneficial. But in general, it should be, because even when "high precision" operations are twice more expensive than "low precision" one, the transformation does not worsen things. Right now this is only enabled for PPC, for which low and high precision operations have the same cost. Tell me if this is not acceptable.
Well you can imagine having more than twice the throughput in f16 than f32 on some targets, and you can also imagine that 2 x f16 operations consume less power than one f32.
I'd rather have Owen's opinion on this.

================
Comment at: lib/CodeGen/SelectionDAG/DAGCombiner.cpp:7214
@@ +7213,3 @@
+    // fold (fsub (fpext (fneg (fmul, x, y))), z)
+    //   -> (fneg (fma (fpext x), (fpext y), z))
+    if (N0.getOpcode() == ISD::FP_EXTEND) {
----------------
ohsallen wrote:
> joker.eph wrote:
> > I come back with the canonicalization problem. Wouldn't it be more canonical to have:
> >   fneg (fadd (fpext (fmul, x, y)), z)
> > 
> > In which case this combine would be useless.
> This code is triggered when TLI.isFPExtFree() returns true. The FPEXT nodes are only generated for correctness, and are expected to be removed later by a MC pass. Thus the code generated here is more canonical in this context.
I'm not sure you understood my comment. I'm talking about the input pattern, not the output pattern.

http://reviews.llvm.org/D8050

EMAIL PREFERENCES
  http://reviews.llvm.org/settings/panel/emailpreferences/






More information about the llvm-commits mailing list