[PATCH] D67434: [InstCombine] Limit FMul constant folding for fma simplifications.

Wed Sep 11 10:18:55 PDT 2019

reames accepted this revision.
reames added a comment.
This revision is now accepted and ready to land.

LGTM, subject to incorporate all comments below before landing.

================
Comment at: llvm/include/llvm/Analysis/InstructionSimplify.h:146
+/// Given operands for the multiplication of a FMA, fold the result or return
+/// null. In contrast to SimplifyFMulInst, this function won't return values
+/// requiring rounding.
----------------
Suggested tweak: In contrast to SimplifyFMulInst, this function will not perform simplifications whose unrounded results differ when rounded to the argument type.  

================
Comment at: llvm/lib/Analysis/InstructionSimplify.cpp:4536

-/// Given the operands for an FMul, see if we can fold the result
-static Value *SimplifyFMulInst(Value *Op0, Value *Op1, FastMathFlags FMF,
-                               const SimplifyQuery &Q, unsigned MaxRecurse) {
-  if (Constant *C = foldOrCommuteConstant(Instruction::FMul, Op0, Op1, Q))
-    return C;
-
-  if (Constant *C = simplifyFPBinop(Op0, Op1))
-    return C;
-
+/// Given operands for the multiplication of a FMA, fold the result or return
+/// null. In contrast to SimplifyFMulInst, this function won't return values
----------------
Please don't repeat the doc comment.  Just in the header is fine.

================
Comment at: llvm/lib/Analysis/InstructionSimplify.cpp:4545

+  // fmul 1.0, X ==> X
+  if (match(Op0, m_FPOne()))
----------------
These two appear to be new transforms.  Please separate the basic optimization which is mostly refactoring plus hooks (this review) and a following optimization change.

================
Comment at: llvm/test/Transforms/InstCombine/fma.ll:502-511
+define <2 x double> @fmuladd_const_fmul(<2 x double> %b) {
+; CHECK-LABEL: @fmuladd_const_fmul(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[RES:%.*]] = fadd nnan nsz <2 x double> [[B:%.*]], <double 0x3F8DB6C076AD949B, double 0x3FB75A405B6E6D69>
+; CHECK-NEXT:    ret <2 x double> [[RES]]
+;
+entry:
----------------
lebedev.ri wrote:
> fhahn wrote:
> > reames wrote:
> > > lebedev.ri wrote:
> > > > reames wrote:
> > > > > reames wrote:
> > > > > > lebedev.ri wrote:
> > > > > > > reames wrote:
> > > > > > > > lebedev.ri wrote:
> > > > > > > > > I don't think this restriction should apply to `@llvm.fmuladd`, as per https://llvm.org/docs/LangRef.html#llvm-fmuladd-intrinsic
> > > > > > > > I don't see why not?  The rounding is specified exactly as described.
> > > > > > > I think my comment got garbled. I'm saying the current patch is correct in the sense that
> > > > > > > if we are simplifying `fmuladd`, we don't have any rounding concerns:
> > > > > > > As per LangRef, `fmuladd` can be arbitrarily expanded to `fmul`+`fadd`, unlike `fma`.
> > > > > > > 
> > > > > > From the langref: "is equivalent to the expression a * b + c, *except that rounding will not be performed* between the multiplication and addition steps if the code generator fuses the operations. "
> > > > > > 
> > > > > > Once fused, this implies the rounding is defined.  We can consider changing the LangRef - I think we maybe should - but per the actual wording, we must do the rounding as per an fma, not the component operations.
> > > > > p.s.  Looking at the implementation, particularly SelectionDAGBuilder, I think you're right about the intent.  We just need to clarify the semantics in LangRef.  
> > > > I fail to read langref any other way already:
> > > > 
> > > > > The ‘llvm.fmuladd.*’ intrinsic functions represent multiply-add expressions that can be fused if the code generator determines that (a) the target instruction set has support for a fused operation, and (b) that the fused operation is more efficient than the equivalent, separate pair of mul and add instructions.
> > > > 
> > > > "can be fused **if**"
> > > > 
> > > > > The expression:
> > > > > `%0 = call float @llvm.fmuladd.f32(%a, %b, %c)`
> > > > > is equivalent to the expression a * b + c, except that rounding will not be performed between the multiplication and addition steps if the code generator fuses the operations. Fusion is not guaranteed, even if the target platform supports it. If a fused multiply-add is required, the corresponding llvm.fma intrinsic function should be used instead.
> > > > 
> > > > "rounding will not be performed between the multiplication and addition steps **if** the code generator fuses the operations"
> > > > "Fusion is not guaranteed"
> > > The key bit of wording here is that the rounding is described *following the decision to fold*.  
> > > 
> > > I'm not disputing that converting " a * b + c" to a fmuladd is valid per this description.  I'm disputing that converting *back* to an "a * b + c" sequence is technically disallowed by the wording.  
> > > The key bit of wording here is that the rounding is described *following the decision to fold*.
> > 
> > My understanding is that as long as it is a fmuladd,  the code generator did not make a decision about fusing yet. IIUC there are 2 ways it can decide to fuse: 1) replace it with an fma intrinsic or lower it to a FMA in the backend.
> I would agree with that if not for the 
> > Fusion is not guaranteed, even if the target platform supports it. If a fused multiply-add is required, the corresponding llvm.fma intrinsic function should be used instead.
> to me that sounds: "use if fmulfadd if you want to request to do FMA; or fma if you want to require FMA (if unavailable compilation will fail)"
Ok, let me rephrase how I'm approaching this.

As a reviewer, I have shared a concern about a potential ambiguity in the LangRef which effects the legality of the discussed transform.  Whether there is an agreement that the ambiguity exists or not, I expect a change made to clarify the semantics to remove the potential ambiguity.  Please consider this a hard requirement before any change to exploit the discussed semantics as proposed in the comment that started this sub-thread.  

(This entire sub-thread is off topic for the actual review.)

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D67434/new/

https://reviews.llvm.org/D67434