[PATCH] D104847: [Clang][NVPTX] Add NVPTX intrinsics and builtins for CUDA PTX 6.5 and 7.0 WMMA and MMA instructions

Artem Belevich via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Jun 25 10:43:49 PDT 2021


tra accepted this revision.
tra added a comment.

LGTM. Would you like me to land the patch on your behalf?



================
Comment at: clang/lib/CodeGen/CGBuiltin.cpp:16397
   unsigned NumEltsD;
   std::array<unsigned, 8> Variants;
 
----------------
A comment describing the expected arrangement of the variants would be helpful here.
E.g. `ordered by layout-A/layout-B/satf, where 'row' has priority over 'col' for layout. The index of non-satf variants is expected to match the undocumented layout constants used by CUDA's mma.hpp`.

It would be cleaner to use designated initializers, but we can't until LLVM switches to C++20.
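
To make the suggested ordering concrete, here is a minimal sketch of how such a table might be indexed. It is not the code from the patch: `VariantTable`, `MMALayout`, `lookup`, and the numeric layout values 0..3 are hypothetical, and the assumption is that the four non-satf entries come first (so their index equals the layout constant) followed by the four satf entries.

```cpp
#include <array>
#include <cstddef>

// Hypothetical layout constants; assumed to go row/row, row/col, col/row,
// col/col, i.e. 'row' sorts before 'col' for A first, then for B.
enum MMALayout : unsigned { RowRow = 0, RowCol = 1, ColRow = 2, ColCol = 3 };

// Illustrative stand-in for the struct under review; only the Variants
// member mirrors the quoted code.
struct VariantTable {
  // Assumed order: indices 0..3 are the non-satf variants, indexed by the
  // layout constant; indices 4..7 are the same layouts with satf.
  std::array<unsigned, 8> Variants;

  // Returns the variant ID for a layout/satf pair, or 0 if out of range.
  unsigned lookup(unsigned LayoutIdx, bool Satf) const {
    std::size_t Index = LayoutIdx + (Satf ? 4u : 0u);
    return Index < Variants.size() ? Variants[Index] : 0;
  }
};
```

With this arrangement a non-satf lookup needs no translation of the layout value, which is why documenting the order next to `Variants` matters.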



CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D104847/new/

https://reviews.llvm.org/D104847


