[llvm] [InstCombine] Transform high latency, dependent FSQRT/FDIV into FMUL (PR #87474)

Wed Apr 17 23:40:55 PDT 2024

================
@@ -705,19 +828,20 @@ Instruction *InstCombinerImpl::foldFMulReassoc(BinaryOperator &I) {
   // has the necessary (reassoc) fast-math-flags.
   if (I.hasNoSignedZeros() &&
       match(Op0, (m_FDiv(m_SpecificFP(1.0), m_Value(Y)))) &&
-      match(Y, m_Sqrt(m_Value(X))) && Op1 == X)
+      match(Y, m_Sqrt(m_Value(X))) && Op1 == X && !delayFMulSqrtTransform(Op0))
----------------
nikic wrote:

Is it possible to make the transform work on the IR after the fmul fold instead of the one before?

https://github.com/llvm/llvm-project/pull/87474