[llvm] GlobalISel needs fdiv 1 / sqrt(x) to rsq combine (PR #78673)

Matt Arsenault via llvm-commits llvm-commits at lists.llvm.org
Thu Jan 25 00:13:01 PST 2024


================
@@ -33,6 +33,14 @@ def rcp_sqrt_to_rsq : GICombineRule<
          [{ return matchRcpSqrtToRsq(*${rcp}, ${matchinfo}); }]),
   (apply [{ Helper.applyBuildFn(*${rcp}, ${matchinfo}); }])>;
 
+def fdiv_1_by_sqrt_to_rsq : GICombineRule<
+  (defs root:$root),
+  (match (G_FSQRT $sqrt, $x, (MIFlags FmContract)),
+         (G_FCONSTANT $one, $fpimm),
+         (G_FDIV $dst, $one, $sqrt, (MIFlags FmContract)):$root,
+         [{ return ${fpimm}.getFPImm()->isExactlyValue(1.0)
+            || ${fpimm}.getFPImm()->isExactlyValue(-1.0); }]),
----------------
arsenm wrote:

Yes, we do try to fold that in other places. AMDGPUCodeGenPrepare handles most of these folds; it just leaves the fastest math paths for codegen.
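
For readers following the hunk above, here is a minimal standalone sketch of the arithmetic the match predicate permits: a numerator of exactly 1.0 or -1.0 divided by sqrt(x) can become a reciprocal square root, with a negation in the -1.0 case. The names rsq and foldFDivBySqrt are hypothetical and are not LLVM APIs. The apply step of the rule is not shown in the hunk, so the negation handling here is an assumption, and the FmContract flag requirement is not modeled.

```cpp
#include <cmath>
#include <optional>

// Hypothetical stand-in for a hardware reciprocal-square-root instruction.
// Not an LLVM API; it only models the math.
static float rsq(float x) { return 1.0f / std::sqrt(x); }

// Models the fold for numerator / sqrt(x).
// Returns the folded value when the numerator is exactly +1.0 or -1.0,
// mirroring the isExactlyValue checks in the match predicate, and
// std::nullopt when the fold does not apply.
// Assumption: the -1.0 case is lowered as a negated rsq.
static std::optional<float> foldFDivBySqrt(float numerator, float x) {
  if (numerator == 1.0f)
    return rsq(x);
  if (numerator == -1.0f)
    return -rsq(x);
  return std::nullopt;
}
```

Any other numerator is left alone in this sketch, which matches the predicate rejecting constants other than 1.0 and -1.0.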

https://github.com/llvm/llvm-project/pull/78673

