[llvm] LangRef: state explicitly that floats generally behave according to IEEE-754 (PR #102140)

Tue Aug 20 14:51:09 PDT 2024

================
@@ -3572,6 +3572,29 @@ or ``syncscope("<target-scope>")`` *synchronizes with* and participates in the
 seq\_cst total orderings of other operations that are not marked
 ``syncscope("singlethread")`` or ``syncscope("<target-scope>")``.
 
+.. _floatsem:
+
+Floating-Point Semantics
+------------------------
+
+LLVM floating-point types fall into two categories:
+
+- half, float, double, and fp128, which correspond to the binary16, binary32,
+  binary64, and binary128 formats described in the IEEE-754 specification.
+- The remaining types, which do not directly correspond to a standard IEEE
+  format.
+
+For types that do correspond to an IEEE format, LLVM IR float operations behave
+like the corresponding operations in IEEE-754, with two exceptions: LLVM makes
+:ref:`specific assumptions about the state of the floating-point environment
+<floatenv>` and it implements :ref:`different rules for operations that return
+NaN values <floatnan>`.
+
+This means that optimizations and backends cannot change the precision of these
+operations (unless there are fast-math flags), and frontends can rely on these
+operations deterministically providing perfectly rounded results as described
+in the standard (except when a NaN is returned).
----------------
jcranmer-intel wrote:

I would add "This also means that backends are not allowed to implement floating-point instructions using larger floating-point types unless they take care to consistently narrow the results back to the original range without inducing double-rounding." or similar text that makes it clear that lowering `fadd float` to just an x87 `FADD` instruction is not a legal lowering.
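
To make the double-rounding concern concrete, here is a minimal sketch in Python that is not part of the patch. It models a format purely by its significand width (53 bits for binary64, 64 bits for the x87 extended format) and ignores exponent range, so it only illustrates the rounding effect, not overflow or subnormal behavior. The helper `round_to_precision` and the chosen operands are hypothetical, picked only so that a single addition shows the discrepancy.

```python
from fractions import Fraction


def round_to_precision(x: Fraction, p: int) -> Fraction:
    """Round x to p significant bits using round-to-nearest, ties-to-even.

    Hypothetical helper: the exponent is unbounded, so this models only
    the significand width of a format, not its exponent range.
    """
    if x == 0:
        return x
    sign = -1 if x < 0 else 1
    x = abs(x)
    # Estimate e so that 2**(p-1) <= x / 2**e < 2**p, then correct it.
    e = x.numerator.bit_length() - x.denominator.bit_length() - p
    scaled = x / Fraction(2) ** e
    while scaled >= 2 ** p:
        e += 1
        scaled /= 2
    while scaled < 2 ** (p - 1):
        e -= 1
        scaled *= 2
    # round() on a Fraction rounds half to even and returns an int.
    return sign * round(scaled) * Fraction(2) ** e


# Both operands are exactly representable in binary64.
a = Fraction(1)
b = Fraction(1, 2 ** 53) + Fraction(1, 2 ** 65)
exact = a + b

# Correctly rounded binary64 addition: one rounding to 53 bits.
direct = round_to_precision(exact, 53)

# Extended-precision addition followed by narrowing: two roundings.
wide = round_to_precision(exact, 64)
narrowed = round_to_precision(wide, 53)

print(direct == 1 + Fraction(1, 2 ** 52))  # single rounding goes up
print(narrowed == 1)                       # double rounding lands on a tie, goes down
```

The exact sum lies just above the midpoint between 1 and 1 + 2^-52, so a single rounding goes up. Rounding to 64 bits first discards the small excess, which leaves an exact tie that the second rounding resolves downward to 1.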

https://github.com/llvm/llvm-project/pull/102140