[llvm] APFloat: fix wrong result status for large floats (PR #189925)

Wed Apr 1 02:50:26 PDT 2026

https://github.com/LucaCiucci created https://github.com/llvm/llvm-project/pull/189925

For large float literals such as `10384593717069655257060992658440193.0`, [`FloatingLiteral::isExact`](https://github.com/llvm/llvm-project/blob/6b2b0da40de1495ace2b100799a35711f7ad7b21/clang/include/clang/AST/Expr.h#L1702) was incorrectly returning `true`.

The issue has been tracked down to `IEEEFloat::roundSignificandWithExponent` incorrectly reporting `opOK`.




>From 9a214d255e78c3af965fead2d8ea4ccb6152ccf0 Mon Sep 17 00:00:00 2001
From: Luca Ciucci <luca.ciucci99 at gmail.com>
Date: Wed, 1 Apr 2026 10:27:57 +0200
Subject: [PATCH] APFloat: fix wrong isExact for large floats

For large float literals such as 10384593717069655257060992658440193.0,
`FloatingLiteral::isExact` was previously returning `true` even though
this value is not exactly representable in any FP precision up to
quadruple.

The reason is that `IEEEFloat::roundSignificandWithExponent` could
incorrectly report `opOK` (exact) for such inputs.

The issue was that the conversion proceeds in two phases:
1. The decimal significand is first converted into an intermediate
   binary representation (`decSig`) at a higher precision.
   This step can already be inexact (tracked via `sigStatus` /
   `powStatus`).
2. The intermediate result is then truncated to the target precision,
   and the final status is computed solely from the lost fraction of
   this truncation:
   ```c++
   return normalize(rounding_mode, calcLostFraction);
   ```

If the intermediate rounding happens to produce a value that is exactly
representable at the target precision (e.g. because low bits were
already discarded earlier), then `calcLostFraction` can be zero, and the
final status is reported as `opOK`, even though information was lost
during step (1).

Fix this by tracking `opInexact` across the whole conversion: we
accumulate inexactness from `convertFromUnsignedParts` (both for the
significand and the power-of-5 factor), and OR it into the final status
returned by `normalize()`.

In this way any loss of information during the conversion pipeline is
reflected in the final `opStatus`.
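
The failure mode and the fix described above can be sketched on plain
64-bit integers. This is an illustration only, not the APFloat code:
`Status`, `Rounded`, `truncateToBits`, `lastStepOnly` and `accumulated`
are hypothetical names, the 32-bit and 24-bit widths are arbitrary
stand-ins for the intermediate and target precisions, and truncation is
used in place of round-to-nearest to keep the sketch short.

```c++
#include <cassert>
#include <cstdint>

// Hypothetical status flags, loosely modelled on opOK / opInexact.
enum Status : unsigned { OK = 0, Inexact = 1 };

struct Rounded {
  uint64_t value;
  Status status;
};

// Keep only the top `bits` significant bits of `v` (bits >= 1) and report
// whether any nonzero bit was dropped.
Rounded truncateToBits(uint64_t v, unsigned bits) {
  unsigned width = 0;
  for (uint64_t t = v; t != 0; t >>= 1)
    ++width;
  if (width <= bits)
    return {v, OK};
  unsigned drop = width - bits;
  uint64_t mask = (uint64_t{1} << drop) - 1;
  return {v & ~mask, (v & mask) != 0 ? Inexact : OK};
}

// Two-phase rounding where only the last phase decides the status.
Status lastStepOnly(uint64_t v) {
  Rounded wide = truncateToBits(v, 32);            // phase 1: intermediate
  Rounded narrow = truncateToBits(wide.value, 24); // phase 2: target
  return narrow.status;
}

// Same two phases, but inexactness from phase 1 is OR-ed into the result.
Status accumulated(uint64_t v) {
  Rounded wide = truncateToBits(v, 32);
  Rounded narrow = truncateToBits(wide.value, 24);
  return static_cast<Status>(wide.status | narrow.status);
}
```

With the input `(uint64_t{1} << 40) + 1`, phase 1 drops the low bit and
yields exactly 2^40, so phase 2 loses nothing: `lastStepOnly` reports
`OK` while `accumulated` reports `Inexact`.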
---
 llvm/lib/Support/APFloat.cpp       |  8 +++++++-
 llvm/unittests/ADT/APFloatTest.cpp | 21 +++++++++++++++++++++
 2 files changed, 28 insertions(+), 1 deletion(-)

diff --git a/llvm/lib/Support/APFloat.cpp b/llvm/lib/Support/APFloat.cpp
index 47c712125f044..dcf503ec7779c 100644
--- a/llvm/lib/Support/APFloat.cpp
+++ b/llvm/lib/Support/APFloat.cpp
@@ -2858,6 +2858,7 @@ IEEEFloat::roundSignificandWithExponent(const integerPart *decSigParts,
   for (;; parts *= 2) {
     opStatus sigStatus, powStatus;
     unsigned int excessPrecision, truncatedBits;
+    opStatus conversionStatus = opOK;
 
     calcSemantics.precision = parts * integerPartWidth - 1;
     excessPrecision = calcSemantics.precision - semantics->precision;
@@ -2869,8 +2870,12 @@ IEEEFloat::roundSignificandWithExponent(const integerPart *decSigParts,
 
     sigStatus = decSig.convertFromUnsignedParts(decSigParts, sigPartCount,
                                                 rmNearestTiesToEven);
+    conversionStatus =
+        static_cast<opStatus>(conversionStatus | (sigStatus & opInexact));
     powStatus = pow5.convertFromUnsignedParts(pow5Parts, pow5PartCount,
                                               rmNearestTiesToEven);
+    conversionStatus =
+        static_cast<opStatus>(conversionStatus | (powStatus & opInexact));
     /* Add exp, as 10^n = 5^n * 2^n.  */
     decSig.exponent += exp;
 
@@ -2917,7 +2922,8 @@ IEEEFloat::roundSignificandWithExponent(const integerPart *decSigParts,
       calcLostFraction = lostFractionThroughTruncation(decSig.significandParts(),
                                                        decSig.partCount(),
                                                        truncatedBits);
-      return normalize(rounding_mode, calcLostFraction);
+      return static_cast<opStatus>(normalize(rounding_mode, calcLostFraction) |
+                                   conversionStatus);
     }
   }
 }
diff --git a/llvm/unittests/ADT/APFloatTest.cpp b/llvm/unittests/ADT/APFloatTest.cpp
index 8ff3efe64c29b..1cf015c07e10f 100644
--- a/llvm/unittests/ADT/APFloatTest.cpp
+++ b/llvm/unittests/ADT/APFloatTest.cpp
@@ -10206,4 +10206,25 @@ TEST(APFloatTest, isValidArbitraryFPFormat) {
   EXPECT_FALSE(APFloat::isValidArbitraryFPFormat("unknown"));
 }
 
+TEST(APFloatTest, DecimalStringPreservesInexactStatus) {
+  APFloat F(APFloat::IEEEsingle());
+
+  auto StatusOr = F.convertFromString("10384593717069655257060992658440193.0",
+                                      APFloat::rmNearestTiesToEven);
+  ASSERT_TRUE(!!StatusOr);
+
+  APFloat::opStatus Status = *StatusOr;
+
+  // The value is 2^113 + 1, not exactly representable in float.
+  EXPECT_TRUE(Status & APFloat::opInexact);
+
+  // But it should round to exactly 2^113.
+  APFloat Expected(APFloat::IEEEsingle());
+  auto ExpectedStatus =
+      Expected.convertFromString("0x1p113", APFloat::rmNearestTiesToEven);
+  ASSERT_TRUE(!!ExpectedStatus);
+
+  EXPECT_EQ(F.bitcastToAPInt(), Expected.bitcastToAPInt());
+}
+
 } // namespace