[llvm] [InstCombine] Transform high latency, dependent FSQRT/FDIV into FMUL (PR #87474)

Sun Jan 12 23:24:01 PST 2025

================
@@ -0,0 +1,631 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 4
+; RUN: opt -S -passes='instcombine<no-verify-fixpoint>' < %s | FileCheck %s
+
+ at x = global double 0.000000e+00
+ at r1 = global double 0.000000e+00
+ at r2 = global double 0.000000e+00
+ at r3 = global double 0.000000e+00
+ at v = global [2 x double] zeroinitializer
+ at v1 = global [2 x double] zeroinitializer
+ at v2 = global [2 x double] zeroinitializer
+
+; div/mul/div1 in the same block.
+define void @bb_constraint_case1(double %a) {
+; CHECK-LABEL: define void @bb_constraint_case1(
+; CHECK-SAME: double [[A:%.*]]) {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[SQRT1:%.*]] = call reassoc double @llvm.sqrt.f64(double [[A]])
+; CHECK-NEXT:    [[TMP0:%.*]] = fdiv reassoc double 1.000000e+00, [[A]]
+; CHECK-NEXT:    [[DIV:%.*]] = fmul reassoc ninf arcp double [[TMP0]], [[SQRT1]]
----------------
sushgokh wrote:

I think it should just be 
`FastMathFlags NewFMF = FastMathFlags::intersectRewrite(Op1FMF, Op2FMF)`
since this is multiplication operation, right?


https://github.com/llvm/llvm-project/pull/87474