[llvm] [InstCombine] Transform high latency, dependent FSQRT/FDIV into FMUL (PR #87474)

Thu Apr 18 00:25:01 PDT 2024

================
@@ -705,19 +828,20 @@ Instruction *InstCombinerImpl::foldFMulReassoc(BinaryOperator &I) {
   // has the necessary (reassoc) fast-math-flags.
   if (I.hasNoSignedZeros() &&
       match(Op0, (m_FDiv(m_SpecificFP(1.0), m_Value(Y)))) &&
-      match(Y, m_Sqrt(m_Value(X))) && Op1 == X)
+      match(Y, m_Sqrt(m_Value(X))) && Op1 == X && !delayFMulSqrtTransform(Op0))
----------------
sushgokh wrote:

Yes (by pattern matching fdiv instead of fmul), unless that fmul fold is transformed again before this transform kicks in. Let me try.

https://github.com/llvm/llvm-project/pull/87474