[PATCH] D67434: [InstCombine] Limit FMul constant folding for fma simplifications.

Wed Sep 11 09:43:41 PDT 2019

reames added inline comments.

================
Comment at: llvm/test/Transforms/InstCombine/fma.ll:502-511
+define <2 x double> @fmuladd_const_fmul(<2 x double> %b) {
+; CHECK-LABEL: @fmuladd_const_fmul(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[RES:%.*]] = fadd nnan nsz <2 x double> [[B:%.*]], <double 0x3F8DB6C076AD949B, double 0x3FB75A405B6E6D69>
+; CHECK-NEXT:    ret <2 x double> [[RES]]
+;
+entry:
----------------
lebedev.ri wrote:
> reames wrote:
> > reames wrote:
> > > lebedev.ri wrote:
> > > > reames wrote:
> > > > > lebedev.ri wrote:
> > > > > > I don't think this restriction should apply to `@llvm.fmuladd`, as per https://llvm.org/docs/LangRef.html#llvm-fmuladd-intrinsic
> > > > > I don't see why not?  The rounding is specified exactly as described.
> > > > I think my comment got garbled. I'm saying the current patch is correct in the sense that
> > > > if we are simplifying `fmuladd`, we don't have any rounding concerns:
> > > > As per LangRef, `fmuladd` can be arbitrarily expanded to `fmul`+`fadd`, unlike `fma`.
> > > > 
> > > From the langref: "is equivalent to the expression a * b + c, *except that rounding will not be performed* between the multiplication and addition steps if the code generator fuses the operations. "
> > > 
> > > Once fused, this implies the rounding is defined.  We can consider changing the LangRef - I think we maybe should - but per the actual wording, we must do the rounding as per an fma, not the component operations.
> > p.s.  Looking at the implementation, particularly SelectionDAGBuilder, I think you're right about the intent.  We just need to clarify the semantics in LangRef.  
> I fail to read langref any other way already:
> 
> > The ‘llvm.fmuladd.*’ intrinsic functions represent multiply-add expressions that can be fused if the code generator determines that (a) the target instruction set has support for a fused operation, and (b) that the fused operation is more efficient than the equivalent, separate pair of mul and add instructions.
> 
> "can be fused **if**"
> 
> > The expression:
> > `%0 = call float @llvm.fmuladd.f32(%a, %b, %c)`
> > is equivalent to the expression a * b + c, except that rounding will not be performed between the multiplication and addition steps if the code generator fuses the operations. Fusion is not guaranteed, even if the target platform supports it. If a fused multiply-add is required, the corresponding llvm.fma intrinsic function should be used instead.
> 
> "rounding will not be performed between the multiplication and addition steps **if** the code generator fuses the operations"
> "Fusion is not guaranteed"
The key bit of wording here is that the rounding is described *following the decision to fold*.  

I'm not disputing that converting " a * b + c" to a fmuladd is valid per this description.  I'm disputing that converting *back* to an "a * b + c" sequence is technically disallowed by the wording.  

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D67434/new/

https://reviews.llvm.org/D67434