[llvm] [APFloat] Add APFloat support for E8M0 type (PR #107127)
Durgadoss R via llvm-commits
llvm-commits at lists.llvm.org
Thu Sep 5 11:58:05 PDT 2024
================
@@ -195,6 +195,12 @@ struct APFloatBase {
// improved range compared to half (16-bit) formats, at (potentially)
// greater throughput than single precision (32-bit) formats.
S_FloatTF32,
+ // 8-bit floating point number with (all the) 8 bits for the exponent
+ // like in FP32. There are no zeroes, no infinities, and no denormal values.
+ // NaN is represented with all bits set to 1. Bias is 127.
+ // This represents the scale data type in the MX specification from
+ // https://www.opencompute.org/documents/ocp-microscaling-formats-mx-v1-0-spec-final-pdf
+ S_Float8E8M0FN,
----------------
durga4github wrote:
Oh, the FN means "Finite". This is consistent with the naming used for other similar types here.
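
For reference, here is a minimal standalone sketch of how a byte in this format maps to a value, going only by the comment in the hunk above: all 8 bits are exponent, the bias is 127, and all-ones is NaN. The helper name `decodeE8M0` is hypothetical and is not part of APFloat or this PR; it only illustrates the encoding.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical helper (not an LLVM API): decode one E8M0 byte to a double.
// Assumes the encoding described in the hunk: no sign bit, no mantissa,
// bias 127, and 0xFF reserved for NaN.
double decodeE8M0(std::uint8_t Bits) {
  if (Bits == 0xFF)
    return std::numeric_limits<double>::quiet_NaN();
  // Every other pattern is a pure power of two: 2^(Bits - 127).
  // There is no zero and no denormal, so Bits == 0 is simply 2^-127.
  return std::ldexp(1.0, static_cast<int>(Bits) - 127);
}
```

Under these assumptions the representable range is 2^-127 (`0x00`) through 2^127 (`0xFE`), and `0x7F` decodes to 1.0.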
https://github.com/llvm/llvm-project/pull/107127