[PATCH] D123147: [X86] Enable enableAggressiveFMAFusion to true for FMA capable targets (PR36826)

Tue Apr 5 11:25:33 PDT 2022

craig.topper added inline comments.

================
Comment at: llvm/test/CodeGen/X86/dag-fmf-cse.ll:6
 ; should be recognized as a factor in the last fsub, so we should
 ; see a mul and add, not a mul and fma:
 ; a * b - (-a * b) ---> (a * b) + (a * b)
----------------
This comment needs to be updated.

================
Comment at: llvm/test/CodeGen/X86/dag-fmf-cse.ll:12
 ; CHECK:       # %bb.0:
-; CHECK-NEXT:    vmulss %xmm1, %xmm0, %xmm0
-; CHECK-NEXT:    vaddss %xmm0, %xmm0, %xmm0
+; CHECK-NEXT:    vmulss %xmm1, %xmm0, %xmm2
+; CHECK-NEXT:    vfmadd213ss {{.*#+}} xmm0 = (xmm1 * xmm0) + xmm2
----------------
I guess this was the flaw you were referring to?

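This looks like a regression on Haswell and Broadwell: the fma has to wait for the vmulss result in xmm2, so the dependency chain becomes mul + fma instead of mul + add, and fma has a higher latency than vaddss on both.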

Latencies (cycles):
| Instruction | HSW | BDW | SKL |
| vaddss | 3 | 3 | 4 |
| vmulss | 5 | 3 | 4 |
| fma | 5 | 5 | 4 |
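
To make the comparison concrete, here is a rough sketch that sums the latencies from the table above along each dependency chain. It assumes a purely serial latency model (no throughput or port-pressure effects) and that the fma cannot start until the vmulss result is available. The names `LATENCY`, `chain_latency`, and `compare` are made up for illustration and are not part of LLVM.

```python
# Latencies in cycles, copied from the table in this comment.
LATENCY = {
    "vaddss": {"HSW": 3, "BDW": 3, "SKL": 4},
    "vmulss": {"HSW": 5, "BDW": 3, "SKL": 4},
    "fma":    {"HSW": 5, "BDW": 5, "SKL": 4},
}

# Old codegen: vmulss feeds vaddss.
OLD_CHAIN = ["vmulss", "vaddss"]
# New codegen: vmulss feeds the addend of the fma.
NEW_CHAIN = ["vmulss", "fma"]


def chain_latency(ops, uarch):
    """Sum the latencies of a serial dependency chain on one microarchitecture."""
    return sum(LATENCY[op][uarch] for op in ops)


def compare():
    """Return {uarch: (old_latency, new_latency)} for each column of the table."""
    return {
        uarch: (chain_latency(OLD_CHAIN, uarch), chain_latency(NEW_CHAIN, uarch))
        for uarch in ("HSW", "BDW", "SKL")
    }


if __name__ == "__main__":
    for uarch, (old, new) in compare().items():
        print(f"{uarch}: old={old} new={new} delta={new - old:+d}")
```

Under those assumptions the chain gets 2 cycles longer on HSW and BDW and is unchanged on SKL.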

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D123147