[llvm] [IR][AsmParser] Revamp how floating-point literals work in LLVM IR. (PR #121838)

Wed Jan 15 08:36:45 PST 2025

================
@@ -4608,31 +4610,40 @@ Simple Constants
     The identifier '``none``' is recognized as an empty token constant
     and must be of :ref:`token type <t_token>`.
 
-The one non-intuitive notation for constants is the hexadecimal form of
-floating-point constants. For example, the form
-'``double    0x432ff973cafa8000``' is equivalent to (but harder to read
-than) '``double 4.5e+15``'. The only time hexadecimal floating-point
-constants are required (and the only time that they are generated by the
-disassembler) is when a floating-point constant must be emitted but it
-cannot be represented as a decimal floating-point number in a reasonable
-number of digits. For example, NaN's, infinities, and other special
-values are represented in their IEEE hexadecimal format so that assembly
-and disassembly do not cause any bits to change in the constants.
-
-When using the hexadecimal form, constants of types bfloat, half, float, and
-double are represented using the 16-digit form shown above (which matches the
-IEEE754 representation for double); bfloat, half and float values must, however,
-be exactly representable as bfloat, IEEE 754 half, and IEEE 754 single
-precision respectively. Hexadecimal format is always used for long double, and
-there are three forms of long double. The 80-bit format used by x86 is
-represented as ``0xK`` followed by 20 hexadecimal digits. The 128-bit format
-used by PowerPC (two adjacent doubles) is represented by ``0xM`` followed by 32
-hexadecimal digits. The IEEE 128-bit format is represented by ``0xL`` followed
-by 32 hexadecimal digits. Long doubles will only work if they match the long
-double format on your target.  The IEEE 16-bit format (half precision) is
-represented by ``0xH`` followed by 4 hexadecimal digits. The bfloat 16-bit
-format is represented by ``0xR`` followed by 4 hexadecimal digits. All
-hexadecimal formats are big-endian (sign bit at the left).
+Floating-point constants support the following kinds of strings:
+
+   +----------------+---------------------------------------------------+
+   | Syntax         | Description                                       |
+   +================+===================================================+
+   | ``+4.5e-13``   | Common decimal literal. Signs are optional, as is |
+   |                | the exponent portion. The decimal point is        |
+   |                | required, as is one or more leading digits before |
+   |                | the decimal point.                                |
+   +----------------+---------------------------------------------------+
+   | ``-0x1.fp13``  | Common hexadecimal literal. Signs are optional.   |
+   |                | The decimal point is required, as is the exponent |
+   |                | portion of the literal (after the ``p``).         |
+   +----------------+---------------------------------------------------+
+   | ``+inf``,      | Positive or negative infinity. The sign is        |
+   | ``-inf``       | required.                                         |
+   +----------------+---------------------------------------------------+
+   | ``+qnan``,     | Positive or negative preferred quiet NaN, i.e.,   |
+   | ``-qnan``      | the quiet bit is set, and all other payload bits  |
+   |                | are 0. The sign is required.                      |
+   +----------------+---------------------------------------------------+
+   | ``+nan(0x1)``  | qNaN value with a particular payload, specified   |
+   |                | as hexadecimal (not including the quiet bit as    |
+   |                | part of the payload). The sign is required.       |
+   +----------------+---------------------------------------------------+
+   | ``+snan(0x1)`` | sNaN value with a particular payload, specified   |
----------------
jcranmer-intel wrote:

The original proposal was `qnan` for the preferred qNaN and `nan(...)` for every other NaN value. I discovered last night that `APFloat::convertFromString` doesn't allow `nan(...)` to produce an sNaN value. After staring at the IEEE 754 and C23 specifications for a bit to see what they want for string->NaN conversions, I concluded that it was better to explicitly call out an `snan(...)` string and have `nan(...)` always produce a qNaN value.

There's not much keeping `qnan` from having a payload parameter, except that `APFloat::convertFromString` doesn't support it. That's changeable, but as far as I noticed, the IEEE 754 specification never uses `qnan` as a qNaN string, so changing `APFloat::convertFromString` to accept it doesn't entirely feel right to me.

FWIW, I also expect that virtually every NaN in practice ends up being `+qnan` or `-qnan` anyway.
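
To make the distinction between the three NaN spellings concrete, here is a rough sketch of the bit patterns they would correspond to for `double`. It assumes the IEEE 754 binary64 layout (1 sign bit, 11 exponent bits, quiet bit as the top mantissa bit, 51 remaining payload bits). The helper `nan_bits` and its parameters are hypothetical names for illustration only; this is not how `APFloat::convertFromString` or the IR parser is implemented.

```python
# Illustrative only: assumes IEEE 754 binary64; nan_bits is a hypothetical helper.
SIGN_BIT = 1 << 63
EXP_ALL_ONES = 0x7FF << 52      # all-ones exponent marks inf/NaN
QUIET_BIT = 1 << 51             # top mantissa bit distinguishes qNaN from sNaN
PAYLOAD_MASK = QUIET_BIT - 1    # the remaining 51 mantissa bits


def nan_bits(payload=0, quiet=True, negative=False):
    """Return the 64-bit pattern of a NaN with the given payload.

    The payload excludes the quiet bit, matching the table's description
    of nan(...) and snan(...).
    """
    if not 0 <= payload <= PAYLOAD_MASK:
        raise ValueError("payload does not fit in 51 bits")
    if not quiet and payload == 0:
        # An all-zero mantissa with an all-ones exponent is infinity, not a NaN.
        raise ValueError("a signaling NaN needs a nonzero payload")
    bits = EXP_ALL_ONES | payload
    if quiet:
        bits |= QUIET_BIT
    if negative:
        bits |= SIGN_BIT
    return bits


print(hex(nan_bits()))                   # +qnan      -> 0x7ff8000000000000
print(hex(nan_bits(negative=True)))      # -qnan      -> 0xfff8000000000000
print(hex(nan_bits(0x1)))                # +nan(0x1)  -> 0x7ff8000000000001
print(hex(nan_bits(0x1, quiet=False)))   # +snan(0x1) -> 0x7ff0000000000001
```

Under these assumptions, `snan(...)` with a zero payload has no valid encoding, since clearing the quiet bit with an all-zero payload yields infinity.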

https://github.com/llvm/llvm-project/pull/121838