[PATCH] D122044: [AMDGPU] New gfx940 mfma instructions

Stanislav Mekhanoshin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Mar 21 10:20:14 PDT 2022


rampitec marked an inline comment as done.
rampitec added inline comments.


================
Comment at: clang/include/clang/Basic/BuiltinsAMDGPU.def:308
 
+TARGET_BUILTIN(__builtin_amdgcn_mfma_i32_16x16x32_i8, "V4iWiWiV4iIiIiIi", "nc", "mai-insts")
+TARGET_BUILTIN(__builtin_amdgcn_mfma_i32_32x32x16_i8, "V16iWiWiV16iIiIiIi", "nc", "mai-insts")
----------------
foad wrote:
> Why do the new ones have `_` before the `i8`/`xf32` suffix? None of the old ones have it. What does `xf32` mean?
The xf32 suffix indicates data in TF32 format: an 8-bit exponent with FP16’s 10-bit mantissa. It uses 32 bits of storage for 19 bits of data (1 sign bit, 8 exponent bits, 10 mantissa bits).
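
As a rough illustration of that 19-bit layout (a sketch, not code from this patch): assuming the TF32 value occupies the top 19 bits of an IEEE-754 binary32 word, a float can be reduced to TF32 precision by clearing its low 13 mantissa bits. The helper name `truncate_to_tf32` is hypothetical, and plain truncation is an assumption; the hardware may round differently.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: keep the sign bit, the 8-bit exponent and the top
 * 10 mantissa bits of a binary32 value, and clear the low 13 mantissa bits.
 * Truncation (round toward zero) is an assumption made for illustration. */
float truncate_to_tf32(float x) {
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);  /* reinterpret without aliasing issues */
    bits &= 0xFFFFE000u;             /* 32 - 13 = 19 significant bits remain */
    memcpy(&x, &bits, sizeof x);
    return x;
}
```

With this sketch, a value such as 1 + 2^-10 passes through unchanged, while 1 + 2^-11 collapses to 1, since the 11th mantissa bit is not stored.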

MFMA instructions got new names in gfx940 (see D121741). The names now include the block size, which was implicit before, because some of the instructions operate on 2x-16x blocks. Adding an 'x' suffix together with the new size factor made the names ambiguous, so the decision was made to separate the fields with an underscore. The old names are preserved as aliases for compatibility with existing programs. The same is true for the builtins, as these are used in existing programs. Going forward, the new names will be used.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D122044/new/

https://reviews.llvm.org/D122044


