[llvm] [InstCombine] Canonicalize reassoc contract fmuladd to fmul + fadd (PR #90434)

Wed May 8 14:15:32 PDT 2024

================
@@ -204,6 +204,34 @@ define float @fmuladd_fneg_x_fneg_y_fast(float %x, float %y, float %z) {
   ret float %fmuladd
 }
 
+define float @fmuladd_unfold(float %x, float %y, float %z) {
+; CHECK-LABEL: @fmuladd_unfold(
+; CHECK-NEXT:    [[TMP1:%.*]] = fmul reassoc contract float [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT:    [[FMULADD:%.*]] = fadd reassoc contract float [[TMP1]], [[Z:%.*]]
+; CHECK-NEXT:    ret float [[FMULADD]]
+;
+  %fmuladd = call reassoc contract float @llvm.fmuladd.f32(float %x, float %y, float %z)
+  ret float %fmuladd
+}
+
+define float @fmuladd_unfold_missing_reassoc(float %x, float %y, float %z) {
+; CHECK-LABEL: @fmuladd_unfold_missing_reassoc(
+; CHECK-NEXT:    [[FMULADD:%.*]] = call contract float @llvm.fmuladd.f32(float [[X:%.*]], float [[Y:%.*]], float [[Z:%.*]])
+; CHECK-NEXT:    ret float [[FMULADD]]
+;
+  %fmuladd = call contract float @llvm.fmuladd.f32(float %x, float %y, float %z)
+  ret float %fmuladd
+}
+
+define float @fmuladd_unfold_missing_contract(float %x, float %y, float %z) {
+; CHECK-LABEL: @fmuladd_unfold_missing_contract(
+; CHECK-NEXT:    [[FMULADD:%.*]] = call reassoc float @llvm.fmuladd.f32(float [[X:%.*]], float [[Y:%.*]], float [[Z:%.*]])
+; CHECK-NEXT:    ret float [[FMULADD]]
+;
+  %fmuladd = call reassoc float @llvm.fmuladd.f32(float %x, float %y, float %z)
+  ret float %fmuladd
+}
+
----------------
andykaylor wrote:

> > That is, without any fast-math flags this
> > `%r = call float @llvm.fmuladd(float %b, float %c, float %d)`
> > is semantically equivalent to this:
> > ```
> > %t1 = fmul contract float %b, %c
> > %t2 = fadd contract float %t1, %d
> > %r = call float @llvm.arithmetic.fence.f32(float %t2)
> > ```
> 
> Not quite, I think, you also need arithmetic fences on `%b`, `%c`, and `%d`.

Yes, you're right.
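
For reference, a minimal sketch of the expansion with fences added on the operands as well as the result, as suggested above. The function name `@fmuladd_expanded` and the value names `%fb`, `%fc`, and `%fd` are made up for illustration; this shows the shape of the idea and is not a statement of the exact semantics, which is what this thread is still discussing.

```llvm
declare float @llvm.arithmetic.fence.f32(float)

; Hypothetical expansion of
;   %r = call float @llvm.fmuladd.f32(float %b, float %c, float %d)
; with no fast-math flags on the call.
define float @fmuladd_expanded(float %b, float %c, float %d) {
  ; Fence each operand so the multiply and add cannot be combined
  ; with whatever computed %b, %c, or %d.
  %fb = call float @llvm.arithmetic.fence.f32(float %b)
  %fc = call float @llvm.arithmetic.fence.f32(float %c)
  %fd = call float @llvm.arithmetic.fence.f32(float %d)
  ; contract permits (but does not require) fusing these two operations.
  %t1 = fmul contract float %fb, %fc
  %t2 = fadd contract float %t1, %fd
  ; Fence the result so the contract flag does not leak into users of %r.
  %r = call float @llvm.arithmetic.fence.f32(float %t2)
  ret float %r
}
```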

> Over the course of this discussion, I think I've leaned more in the direction of not canonicalizing `fmuladd` into `fmul + fadd`, and instead making passes support the instruction as if it were an `fma` (although I note that the motivating issue also seems to fail if you use an `fma` as a reduction).

I'm not sure what you mean by "as if it were an `fma`." If you mean passes must treat the intrinsic as an atomic operation, I'm inclined to agree. If you mean passes must assume that it will have the semantics of a fused operation, then no.
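
To make the distinction concrete, here is a small sketch contrasting the two intrinsics. The function names `@maybe_fused` and `@always_fused` are hypothetical, and the comments only restate the LangRef descriptions of `llvm.fmuladd` and `llvm.fma`.

```llvm
declare float @llvm.fmuladd.f32(float, float, float)
declare float @llvm.fma.f32(float, float, float)

define float @maybe_fused(float %a, float %b, float %c) {
  ; A single call in the IR, but the code generator may lower it either
  ; as a fused multiply-add or as a separate fmul and fadd. A pass can
  ; treat it as one unit without assuming which rounding behavior it gets.
  %r = call float @llvm.fmuladd.f32(float %a, float %b, float %c)
  ret float %r
}

define float @always_fused(float %a, float %b, float %c) {
  ; Always has fused semantics: a * b + c with a single rounding.
  %r = call float @llvm.fma.f32(float %a, float %b, float %c)
  ret float %r
}
```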

> [I think we have a clear contender for a topic in next week's FP working group meeting!]

Yes! I was thinking it would be good if we could start using the FP working group meeting to dig into your RFC one topic at a time and try to make some progress. This does look like a manageable place to start.

https://github.com/llvm/llvm-project/pull/90434