[PATCH] D26602: [DAGCombiner] do not fold (fmul (fadd X, 1), Y) -> (fmad X, Y, Y) by default

Mon Nov 14 09:22:04 PST 2016

spatel added a comment.

In https://reviews.llvm.org/D26602#594424, @nhaehnle wrote:

> The optimization makes use of the distributive law, not the associative law. As such I don't think there should be an immediate connection to a flag named -fassociative-math. But maybe that just means that the flag is badly named :-)
>
> Personally, I don't care about Clang flags since I'm worried about our OpenGL frontend which uses LLVM directly, but it seems to me that there are meaningful associativity transforms that can be done even when infs are possible. Are (a + b) + c --> a + (b + c) and (a * b) * c --> a * (b * c) problematic if any of the variables are inf or nan? I don't think so, but perhaps I'm missing something.

Good points. At the very least, we should rename this function (visitFMULForFMACombine) and/or add code comments to make it clear that we're only dealing with the case of a '+/-1.0' FADD/FSUB operand. I think that should be done ahead of this patch as an NFC commit.
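
For reference, here is a minimal sketch of why the '+1.0' case is sensitive to infinities. It is not the DAGCombiner code: the helper names `fmul_fadd` and `fmad` are made up for illustration, and the mad is modelled as an unfused `x * y + y` in ordinary double arithmetic.

```python
import math

def fmul_fadd(x, y):
    """Original form: (fmul (fadd x, 1.0), y). Hypothetical helper name."""
    return (x + 1.0) * y

def fmad(x, y):
    """Folded form: (fmad x, y, y), modelled as an unfused x*y + y. Hypothetical helper name."""
    return x * y + y

# Finite inputs: both forms agree.
print(fmul_fadd(2.0, 3.0), fmad(2.0, 3.0))            # 9.0 9.0

# x = 0, y = inf: the original gives 1 * inf = inf, but the folded form
# computes 0 * inf = nan first, and nan + inf stays nan.
print(fmul_fadd(0.0, math.inf), fmad(0.0, math.inf))  # inf nan
```

So the distributed form can introduce a nan where the original had none, which is independent of any rounding difference between the two forms.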

So let's use the codegen definitions since there's no hope of sorting out the connection to the higher-level definitions in this patch. :)

1. FPOpFusion::Fast - Enable fusion of FP ops wherever it's profitable.

2. UnsafeFPMath - This flag is enabled when the -enable-unsafe-fp-math flag is specified on the command line. When this flag is off (the default), the code generator is not allowed to produce results that are "less precise" than IEEE allows... UnsafeFPMath implies LessPreciseFPMAD.

3. LessPreciseFPMADOption - This flag is enabled when the -enable-fp-mad flag is specified on the command line. When this flag is off (the default), the code generator is not allowed to generate mad (multiply add) if the result is "less precise" than doing those operations individually.

4. NoInfsFPMath - This flag is enabled when the -enable-no-infs-fp-math flag is specified on the command line. When this flag is off (the default), the code generator is not allowed to assume the FP arithmetic arguments and results are never +-Infs.

Does NoInfsFPMath override FPOpFusion::Fast? Or do we need another enum value/flag to answer that question? Are there other transforms that need to be aware of this interaction?
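
To make the first question concrete, here is a rough sketch of one possible reading, in which the fold needs both permission to fuse and permission to ignore infs. This is only an assumption for discussion, not what the patch or the DAGCombiner does: the `Options` class and `may_fold_fmul_of_fadd_one` are hypothetical, and the field names merely mirror the four options listed above.

```python
from dataclasses import dataclass

@dataclass
class Options:
    # Hypothetical stand-in for the codegen options listed above.
    fp_op_fusion_fast: bool = False    # FPOpFusion::Fast
    unsafe_fp_math: bool = False       # UnsafeFPMath (implies LessPreciseFPMAD)
    less_precise_fpmad: bool = False   # LessPreciseFPMADOption
    no_infs_fp_math: bool = False      # NoInfsFPMath

def may_fold_fmul_of_fadd_one(opts: Options) -> bool:
    """One possible policy (an assumption): fusion must be allowed AND infs ruled out."""
    fusion_allowed = (opts.fp_op_fusion_fast
                      or opts.unsafe_fp_math
                      or opts.less_precise_fpmad)
    return fusion_allowed and opts.no_infs_fp_math

# Under this reading, FPOpFusion::Fast alone is not enough for the fold.
print(may_fold_fmul_of_fadd_one(Options(fp_op_fusion_fast=True)))
print(may_fold_fmul_of_fadd_one(Options(fp_op_fusion_fast=True, no_infs_fp_math=True)))
```

Under that reading NoInfsFPMath does not override FPOpFusion::Fast; it is an extra precondition for this particular fold, and other fusion transforms would be unaffected.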

https://reviews.llvm.org/D26602