[llvm] [InstCombine] Optimize `sinh` and `cosh` divisions (PR #81433)

Thu Feb 29 16:42:00 PST 2024

================
@@ -0,0 +1,114 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
+; RUN: opt -S -passes=instcombine < %s | FileCheck %s
+
+define double @fdiv_cosh_sinh(double %a) {
+; CHECK-LABEL: @fdiv_cosh_sinh(
+; CHECK-NEXT:    [[TMP1:%.*]] = call double @cosh(double [[A:%.*]])
+; CHECK-NEXT:    [[TMP2:%.*]] = call double @sinh(double [[A]])
+; CHECK-NEXT:    [[DIV:%.*]] = fdiv double [[TMP1]], [[TMP2]]
+; CHECK-NEXT:    ret double [[DIV]]
+;
+  %1 = call double @cosh(double %a)
+  %2 = call double @sinh(double %a)
+  %div = fdiv double %1, %2
+  ret double %div
+}
+
+define double @fdiv_strict_cosh_strict_sinh_reassoc(double %a) {
+; CHECK-LABEL: @fdiv_strict_cosh_strict_sinh_reassoc(
+; CHECK-NEXT:    [[TMP1:%.*]] = call double @cosh(double [[A:%.*]])
+; CHECK-NEXT:    [[TMP2:%.*]] = call reassoc double @sinh(double [[A]])
+; CHECK-NEXT:    [[DIV:%.*]] = fdiv double [[TMP1]], [[TMP2]]
+; CHECK-NEXT:    ret double [[DIV]]
+;
+  %1 = call double @cosh(double %a)
+  %2 = call reassoc double @sinh(double %a)
+  %div = fdiv double %1, %2
+  ret double %div
+}
+
+define double @fdiv_reassoc_cosh_strict_sinh_strict(double %a, ptr dereferenceable(2) %dummy) {
+; CHECK-LABEL: @fdiv_reassoc_cosh_strict_sinh_strict(
+; CHECK-NEXT:    [[TANH:%.*]] = call reassoc double @tanh(double [[A]])
+; CHECK-NEXT:    [[DIV:%.*]] = fdiv reassoc double 1.000000e+00, [[TANH]]
+; CHECK-NEXT:    ret double [[DIV]]
+;
+  %1 = call double @cosh(double %a)
+  %2 = call double @sinh(double %a)
+  %div = fdiv reassoc double %1, %2
----------------
andykaylor wrote:

I know you copied this from an existing optimization, but it's not clear to me why `reassoc` is the flag that enables this and why it only needs to be set on the `fdiv` instruction. It seems more like `afn`?

I looked at the code review for the previous optimization (https://reviews.llvm.org/D41286), and there it was "shown" that for a brute force evaluation of all single precision values sin(x)/cos(x) was equivalent to tan(x). I tested it myself and it turns out that the equivalence was there because the code being used for the test was calling 64-bit versions of the functions then truncating them to 32-bit. If you call the 32-bit versions, there are many values for which there is a minor numeric difference.

I did a similar test with random values, testing both cosh(x)/sinh(x) and sinh(x)/cosh(x), and it looks like this transformation is going to introduce about a 2 ULP difference in many cases. That's perfectly reasonable for fast-math. My only objection is which flags we're checking.

https://github.com/llvm/llvm-project/pull/81433