[PATCH] D151923: [APFloat] Add APFloat semantic support for TF32

David Majnemer via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Jun 2 11:43:06 PDT 2023


majnemer added inline comments.


================
Comment at: llvm/include/llvm/ADT/APFloat.h:190
+    // greater throughput than single precision (32-bit) formats.
+    S_FloatTF32,
 
----------------
Hmm, this says improved precision over half, but the semantics you gave specify 11 bits, which is the same precision as half. Does NVIDIA document how many bits we should expect?
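(For context, a quick sketch of what I mean using the existing APFloat query helpers; IEEE half already reports 11 bits of precision, the same value proposed for semFloatTF32 below:)

```cpp
#include "llvm/ADT/APFloat.h"
using namespace llvm;

// IEEE half: 10 stored mantissa bits + 1 implicit bit.
unsigned HalfPrec = APFloat::semanticsPrecision(APFloat::IEEEhalf());     // 11
// Single precision, for comparison: 23 stored + 1 implicit.
unsigned SinglePrec = APFloat::semanticsPrecision(APFloat::IEEEsingle()); // 24
```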


================
Comment at: llvm/lib/Support/APFloat.cpp:141
     4, -10, 4, 8, fltNonfiniteBehavior::NanOnly, fltNanEncoding::NegativeZero};
+static constexpr fltSemantics semFloatTF32 = {127, -126, 11, 19};
 static constexpr fltSemantics semX87DoubleExtended = {16383, -16382, 64, 80};
----------------
NVIDIA's [docs](https://docs.nvidia.com/cuda/parallel-thread-execution/#alternate-floating-point-data-formats) say:
> This data format is a special 32-bit floating point format supported by the matrix multiply-and-accumulate instructions, with the same range as .f32 and reduced precision (>=10 bits). The internal layout of tf32 format is implementation defined. PTX facilitates conversion from single precision .f32 type to tf32 format. A register variable containing tf32 data must be declared with .b32 type.

As written, it's at least 11 bits of precision, but that can change over time. Will we need corresponding flavors of this for future architectures?
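(For illustration, a minimal sketch of how the reduced precision shows up through the usual conversion API; this assumes the patch also exposes the new semantics via an APFloat::FloatTF32() accessor, mirroring BFloat() and the FP8 accessors, which is not in tree yet:)

```cpp
#include "llvm/ADT/APFloat.h"
using namespace llvm;

int main() {
  // 1 + 2^-12 needs 13 bits of precision, so it is exactly representable
  // in single precision but not in an 11-bit-precision format.
  APFloat Val(1.0f + 0x1p-12f);
  bool LosesInfo = false;
  // FloatTF32() is the accessor assumed from this patch.
  APFloat::opStatus St =
      Val.convert(APFloat::FloatTF32(), APFloat::rmNearestTiesToEven,
                  &LosesInfo);
  // Expect St == opInexact and LosesInfo == true: the 2^-12 term rounds away.
  (void)St;
  return LosesInfo ? 0 : 1;
}
```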


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D151923/new/

https://reviews.llvm.org/D151923


