[llvm] [IR][Float8] Add two kinds float8 IR type (PR #89900)
Joshua Cranmer via llvm-commits
llvm-commits at lists.llvm.org
Wed Apr 24 11:01:11 PDT 2024
================
@@ -3871,9 +3879,9 @@ Floating-Point Types
* - ``ppc_fp128``
- 128-bit floating-point value (two 64-bits)
-The binary format of half, float, double, and fp128 correspond to the
-IEEE-754-2008 specifications for binary16, binary32, binary64, and binary128
-respectively.
+The binary format of float8e5m2, half, float, double, and fp128 correspond
+to the IEEE-754-2008 specifications for binary8, binary16, binary32, binary64,
----------------
jcranmer-intel wrote:
Neither IEEE 754-2008 nor IEEE 754-2019 defines a binary8 type. The table for binary{k} only applies when k >= 128 and k % 32 == 0. Even if you were to ignore that restriction, the formulas for k = 8 give a precision p of 9 bits and an exponent field width w of -1 bits.
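
As a rough check of that arithmetic, here is a minimal sketch of the binary{k} parameter formulas from the standard's interchange-format table, w = round(4 * log2(k)) - 13 and p = k - round(4 * log2(k)) + 13. The helper name `binaryk_params` is hypothetical, and it deliberately skips the k >= 128, k % 32 == 0 precondition so that k = 8 can be evaluated.

```python
import math


def binaryk_params(k: int) -> tuple[int, int]:
    """Return (p, w) from the IEEE 754 binary{k} formulas.

    p is the precision in bits and w is the exponent field width in bits.
    Assumption for illustration: the precondition (k >= 128 and
    k % 32 == 0) is intentionally NOT enforced here.
    """
    r = round(4 * math.log2(k))
    w = r - 13       # exponent field width
    p = k - r + 13   # precision, including the implicit leading bit
    return p, w


for k in (8, 128, 256):
    p, w = binaryk_params(k)
    print(f"k={k}: p={p}, w={w}")
```

For k = 128 this gives p = 113 and w = 15, which matches binary128. For k = 8 it gives p = 9 and w = -1, which is not a usable encoding and does not resemble float8e5m2 (5 exponent bits, 2 explicit significand bits).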
https://github.com/llvm/llvm-project/pull/89900