[flang-commits] [PATCH] D132164: [flang] Use naive algorithm for folding complex division when it doesn't over/underflow
Peter Klausler via Phabricator via flang-commits
flang-commits at lists.llvm.org
Thu Aug 18 11:25:56 PDT 2022
klausler created this revision.
klausler added a reviewer: vdonaldson.
klausler added a project: Flang.
Herald added a subscriber: jdoerfert.
Herald added a project: All.
klausler requested review of this revision.
f18 unconditionally uses a scaling algorithm for complex/complex division
that avoids needless overflows and underflows when computing the sum of
the squares of the components of the denominator -- but testing has shown
some 1 ULP differences relative to the naive calculation due to the
extra operations and roundings. So use the scaling algorithm only when
the naive calculation actually would overflow or underflow.
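The strategy described above can be sketched as a small standalone C++ example on plain double values. This is not flang's code: Cplx, Mul, Add, Div, NaiveDivide, ScaledDivide, and Divide are hypothetical names, and range problems are detected here by inspecting intermediate values rather than by testing RealFlags as the patch does. The scaled fallback is a Smith-style formulation assumed for illustration; the scaled path in complex.cpp may differ in detail.

```cpp
#include <cmath>

struct Cplx {
  double re, im;
};

// Value-based stand-ins for the Overflow/Underflow flag tests in the patch
// (an assumption of this sketch, not flang's RealFlags mechanism).
// A product is suspect if it is Inf, NaN, or subnormal, or if it is zero
// although neither operand is zero.
double Mul(double x, double y, bool &ok) {
  double p = x * y;
  if (!std::isnormal(p) && !(p == 0.0 && (x == 0.0 || y == 0.0))) ok = false;
  return p;
}

// A sum is suspect only if it is not finite.
double Add(double x, double y, bool &ok) {
  double s = x + y;
  if (!std::isfinite(s)) ok = false;
  return s;
}

// A quotient is suspect if it is Inf, NaN, or subnormal, or if it is zero
// although the numerator is not zero.
double Div(double n, double d, bool &ok) {
  double q = n / d;
  if (!std::isnormal(q) && !(q == 0.0 && n == 0.0)) ok = false;
  return q;
}

// Naive formula: ((ac+bd)/(cc+dd)) + i((bc-ad)/(cc+dd)).
// Returns false if any intermediate looked like an overflow or underflow.
bool NaiveDivide(Cplx x, Cplx y, Cplx &out) {
  bool ok = true;
  double cc = Mul(y.re, y.re, ok);
  double dd = Mul(y.im, y.im, ok);
  double den = Add(cc, dd, ok);
  double ac = Mul(x.re, y.re, ok);
  double ad = Mul(x.re, y.im, ok);
  double bc = Mul(x.im, y.re, ok);
  double bd = Mul(x.im, y.im, ok);
  double re = Div(Add(ac, bd, ok), den, ok);
  double im = Div(Add(bc, -ad, ok), den, ok);
  out = {re, im};
  return ok;
}

// Smith-style scaling: divide through by the larger of |c| and |d| so that
// the scale factor is <= 1.0 in magnitude and cc+dd is never formed.
Cplx ScaledDivide(Cplx x, Cplx y) {
  if (std::fabs(y.re) >= std::fabs(y.im)) {
    double s = y.im / y.re;        // d/c
    double den = y.re + y.im * s;  // c + d*(d/c)
    return {(x.re + x.im * s) / den, (x.im - x.re * s) / den};
  }
  double s = y.re / y.im;          // c/d
  double den = y.re * s + y.im;    // c*(c/d) + d
  return {(x.re * s + x.im) / den, (x.im * s - x.re) / den};
}

// Try the naive sequence first; fall back to scaling only when needed.
Cplx Divide(Cplx x, Cplx y) {
  Cplx q;
  if (NaiveDivide(x, y, q)) return q;
  return ScaledDivide(x, y);
}
```

In this sketch, Divide({1.0, 2.0}, {3.0, 4.0}) stays on the naive path, whereas Divide({1e200, 1e200}, {1e200, 1e200}) falls back to scaling because cc overflows, and Divide({1e-200, 1e-200}, {1e-200, 1e-200}) falls back because cc underflows to zero.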
https://reviews.llvm.org/D132164
Files:
flang/lib/Evaluate/complex.cpp
Index: flang/lib/Evaluate/complex.cpp
===================================================================
--- flang/lib/Evaluate/complex.cpp
+++ flang/lib/Evaluate/complex.cpp
@@ -47,11 +47,30 @@
 ValueWithRealFlags<Complex<R>> Complex<R>::Divide(
     const Complex &that, Rounding rounding) const {
   // (a + ib)/(c + id) -> [(a+ib)*(c-id)] / [(c+id)*(c-id)]
-  // -> [ac+bd+i(bc-ad)] / (cc+dd)
+  // -> [ac+bd+i(bc-ad)] / (cc+dd) -- note (cc+dd) is real
   // -> ((ac+bd)/(cc+dd)) + i((bc-ad)/(cc+dd))
-  // but to avoid overflows, scale by d/c if c>=d, else c/d
-  Part scale; // <= 1.0
   RealFlags flags;
+  Part cc{that.re_.Multiply(that.re_, rounding).AccumulateFlags(flags)};
+  Part dd{that.im_.Multiply(that.im_, rounding).AccumulateFlags(flags)};
+  Part ccPdd{cc.Add(dd, rounding).AccumulateFlags(flags)};
+  if (!flags.test(RealFlag::Overflow) && !flags.test(RealFlag::Underflow)) {
+    // den = (cc+dd) did not overflow or underflow; try the naive
+    // sequence without scaling to avoid extra roundings.
+    Part ac{re_.Multiply(that.re_, rounding).AccumulateFlags(flags)};
+    Part ad{re_.Multiply(that.im_, rounding).AccumulateFlags(flags)};
+    Part bc{im_.Multiply(that.re_, rounding).AccumulateFlags(flags)};
+    Part bd{im_.Multiply(that.im_, rounding).AccumulateFlags(flags)};
+    Part acPbd{ac.Add(bd, rounding).AccumulateFlags(flags)};
+    Part bcSad{bc.Subtract(ad, rounding).AccumulateFlags(flags)};
+    Part re{acPbd.Divide(ccPdd, rounding).AccumulateFlags(flags)};
+    Part im{bcSad.Divide(ccPdd, rounding).AccumulateFlags(flags)};
+    if (!flags.test(RealFlag::Overflow) && !flags.test(RealFlag::Underflow)) {
+      return {Complex{re, im}, flags};
+    }
+  }
+  // Scale numerator and denominator by d/c (if c>=d) or c/d (if c<d)
+  flags.clear();
+  Part scale; // will be <= 1.0 in magnitude
   bool cGEd{that.re_.ABS().Compare(that.im_.ABS()) != Relation::Less};
   if (cGEd) {
     scale = that.im_.Divide(that.re_, rounding).AccumulateFlags(flags);