[llvm] APFloat: fix wrong result status for large floats (PR #189925)
Luca Ciucci via llvm-commits
llvm-commits at lists.llvm.org
Wed Apr 1 02:50:26 PDT 2026
https://github.com/LucaCiucci created https://github.com/llvm/llvm-project/pull/189925
For large float literals such as `10384593717069655257060992658440193.0`, [`FloatingLiteral::isExact`](https://github.com/llvm/llvm-project/blob/6b2b0da40de1495ace2b100799a35711f7ad7b21/clang/include/clang/AST/Expr.h#L1702) was incorrectly returning `true`.
The issue has been tracked down to `IEEEFloat::roundSignificandWithExponent` incorrectly reporting `opOK`.
From 9a214d255e78c3af965fead2d8ea4ccb6152ccf0 Mon Sep 17 00:00:00 2001
From: Luca Ciucci <luca.ciucci99 at gmail.com>
Date: Wed, 1 Apr 2026 10:27:57 +0200
Subject: [PATCH] APFloat: fix wrong isExact for large floats
For large float literals such as 10384593717069655257060992658440193.0,
`FloatingLiteral::isExact` was previously returning `true`, even though
this value (2^113 + 1) is not exactly representable in any FP precision
up to quadruple.
The reason is that `IEEEFloat::roundSignificandWithExponent` could
incorrectly report `opOK` (exact).
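As a sanity check on the literal, the following standalone sketch (not part of the patch; `powerOfTwoPlusOne` is a hypothetical helper name) computes the decimal digits of 2^113 + 1 by repeated doubling. A value of that form needs 114 significant bits, one more than the 113-bit significand of IEEE quadruple, which is why no precision up to quadruple can hold it exactly.

```c++
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: returns the decimal digits of 2^exponent + 1.
// Uses schoolbook doubling on a digit vector, so no wide integer type
// is needed.
std::string powerOfTwoPlusOne(unsigned exponent) {
  // For exponent >= 1 the last digit of 2^exponent is 2, 4, 6 or 8,
  // so adding 1 below never produces a carry.
  assert(exponent >= 1 && "sketch assumes an even power of two");

  std::vector<int> digits{1}; // little-endian decimal digits of 2^0
  for (unsigned i = 0; i < exponent; ++i) {
    int carry = 0;
    for (int &d : digits) {
      int v = d * 2 + carry;
      d = v % 10;
      carry = v / 10;
    }
    if (carry)
      digits.push_back(carry);
  }
  digits[0] += 1;

  // Emit the digits most significant first.
  std::string out;
  for (auto it = digits.rbegin(); it != digits.rend(); ++it)
    out.push_back(static_cast<char>('0' + *it));
  return out;
}
```

Calling `powerOfTwoPlusOne(113)` should reproduce the integer part of the literal above.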
The issue was that the conversion proceeds in two phases:

1. The decimal significand is first converted into an intermediate
   binary representation (`decSig`) at a higher precision.
   This step can already be inexact (tracked via `sigStatus` /
   `powStatus`).
2. The intermediate result is then truncated to the target precision,
   and the final status is computed solely from the lost fraction of
   this truncation:
```c++
return normalize(rounding_mode, calcLostFraction);
```
If the intermediate rounding happens to produce a value that is exactly
representable at the target precision (e.g. because low bits were
already discarded earlier), then `calcLostFraction` can be zero, and the
final status is reported as `opOK`, even though information was lost
during step (1).
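The failure mode can be reproduced with a toy model on plain integers. This is only a sketch under simplifying assumptions: it rounds a `uint64_t` to a given number of significant bits twice, and every name in it (`Rounded`, `roundToPrecision`, `buggyIsInexact`, `fixedIsInexact`) is hypothetical, not taken from APFloat.

```c++
#include <cassert>
#include <cstdint>

// Result of rounding: the rounded value and whether any bits were lost.
struct Rounded {
  uint64_t value;
  bool inexact;
};

// Round v to `bits` significant bits, nearest with ties to even.
Rounded roundToPrecision(uint64_t v, unsigned bits) {
  assert(bits >= 1 && bits < 62 && (v >> 62) == 0 &&
         "toy model: keep headroom so rounding up cannot overflow");
  unsigned width = 0;
  for (uint64_t t = v; t != 0; t >>= 1)
    ++width;
  if (width <= bits)
    return {v, false}; // already fits, nothing is dropped

  unsigned drop = width - bits;
  uint64_t lost = v & ((uint64_t{1} << drop) - 1);
  uint64_t half = uint64_t{1} << (drop - 1);
  uint64_t kept = v >> drop;
  if (lost > half || (lost == half && (kept & 1)))
    ++kept;
  return {kept << drop, lost != 0};
}

// Models the old behaviour: only the second rounding decides the status.
bool buggyIsInexact(uint64_t v, unsigned wide, unsigned target) {
  Rounded phase1 = roundToPrecision(v, wide);
  Rounded phase2 = roundToPrecision(phase1.value, target);
  return phase2.inexact; // phase1.inexact is silently dropped
}

// Models the fix: inexactness from either phase is reported.
bool fixedIsInexact(uint64_t v, unsigned wide, unsigned target) {
  Rounded phase1 = roundToPrecision(v, wide);
  Rounded phase2 = roundToPrecision(phase1.value, target);
  return phase1.inexact || phase2.inexact;
}
```

With `v = 2^20 + 1`, an intermediate width of 16 bits and a target of 8 bits, the first rounding already yields `2^20`, so the second rounding loses nothing: `buggyIsInexact` reports exact while `fixedIsInexact` reports inexact.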
Fix this by tracking `opInexact` across the whole conversion: we
accumulate inexactness from `convertFromUnsignedParts` (both for the
significand and the power-of-5 factor), and OR it into the final status
returned by `normalize()`.
In this way any loss of information during the conversion pipeline is
reflected in the final `opStatus`.
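The flag handling can be looked at in isolation. The sketch below uses a standalone enum that is meant to mirror the bit layout of `APFloat::opStatus` (the values are written from memory and should be treated as an assumption); `accumulateInexact` and `mergeFinal` are hypothetical helper names that do not exist in LLVM. Masking with `opInexact` means that only inexactness from the intermediate conversions is carried forward, not any other flag.

```c++
#include <cassert>

// Standalone mirror of the status bits; assumed to match APFloat's layout.
enum opStatus {
  opOK = 0x00,
  opInvalidOp = 0x01,
  opDivByZero = 0x02,
  opOverflow = 0x04,
  opUnderflow = 0x08,
  opInexact = 0x10
};

// Fold the inexact bit (and nothing else) of one step into the accumulator.
opStatus accumulateInexact(opStatus acc, opStatus step) {
  return static_cast<opStatus>(acc | (step & opInexact));
}

// Combine the status of the final rounding with the accumulated status.
opStatus mergeFinal(opStatus normalizeStatus, opStatus conversionStatus) {
  assert((conversionStatus & ~opInexact) == 0 &&
         "only inexactness is expected in the accumulator");
  return static_cast<opStatus>(normalizeStatus | conversionStatus);
}
```

In this model, a final rounding that reports `opOK` combined with an accumulator holding `opInexact` yields `opInexact`, which is the behaviour the new unit test expects.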
---
llvm/lib/Support/APFloat.cpp | 8 +++++++-
llvm/unittests/ADT/APFloatTest.cpp | 21 +++++++++++++++++++++
2 files changed, 28 insertions(+), 1 deletion(-)
diff --git a/llvm/lib/Support/APFloat.cpp b/llvm/lib/Support/APFloat.cpp
index 47c712125f044..dcf503ec7779c 100644
--- a/llvm/lib/Support/APFloat.cpp
+++ b/llvm/lib/Support/APFloat.cpp
@@ -2858,6 +2858,7 @@ IEEEFloat::roundSignificandWithExponent(const integerPart *decSigParts,
for (;; parts *= 2) {
opStatus sigStatus, powStatus;
unsigned int excessPrecision, truncatedBits;
+ opStatus conversionStatus = opOK;
calcSemantics.precision = parts * integerPartWidth - 1;
excessPrecision = calcSemantics.precision - semantics->precision;
@@ -2869,8 +2870,12 @@ IEEEFloat::roundSignificandWithExponent(const integerPart *decSigParts,
sigStatus = decSig.convertFromUnsignedParts(decSigParts, sigPartCount,
rmNearestTiesToEven);
+ conversionStatus =
+ static_cast<opStatus>(conversionStatus | (sigStatus & opInexact));
powStatus = pow5.convertFromUnsignedParts(pow5Parts, pow5PartCount,
rmNearestTiesToEven);
+ conversionStatus =
+ static_cast<opStatus>(conversionStatus | (powStatus & opInexact));
/* Add exp, as 10^n = 5^n * 2^n. */
decSig.exponent += exp;
@@ -2917,7 +2922,8 @@ IEEEFloat::roundSignificandWithExponent(const integerPart *decSigParts,
calcLostFraction = lostFractionThroughTruncation(decSig.significandParts(),
decSig.partCount(),
truncatedBits);
- return normalize(rounding_mode, calcLostFraction);
+ return static_cast<opStatus>(normalize(rounding_mode, calcLostFraction) |
+ conversionStatus);
}
}
}
diff --git a/llvm/unittests/ADT/APFloatTest.cpp b/llvm/unittests/ADT/APFloatTest.cpp
index 8ff3efe64c29b..1cf015c07e10f 100644
--- a/llvm/unittests/ADT/APFloatTest.cpp
+++ b/llvm/unittests/ADT/APFloatTest.cpp
@@ -10206,4 +10206,25 @@ TEST(APFloatTest, isValidArbitraryFPFormat) {
EXPECT_FALSE(APFloat::isValidArbitraryFPFormat("unknown"));
}
+TEST(APFloatTest, DecimalStringPreservesInexactStatus) {
+ APFloat F(APFloat::IEEEsingle());
+
+ auto StatusOr = F.convertFromString("10384593717069655257060992658440193.0",
+ APFloat::rmNearestTiesToEven);
+ ASSERT_TRUE(!!StatusOr);
+
+ APFloat::opStatus Status = *StatusOr;
+
+ // The value is 2^113 + 1, not exactly representable in float.
+ EXPECT_TRUE(Status & APFloat::opInexact);
+
+ // But it should round to exactly 2^113.
+ APFloat Expected(APFloat::IEEEsingle());
+ auto ExpectedStatus =
+ Expected.convertFromString("0x1p113", APFloat::rmNearestTiesToEven);
+ ASSERT_TRUE(!!ExpectedStatus);
+
+ EXPECT_EQ(F.bitcastToAPInt(), Expected.bitcastToAPInt());
+}
+
} // namespace