[PATCH] D104847: [Clang][NVPTX] Add NVPTX intrinsics and builtins for CUDA PTX 6.5 and 7.0 WMMA and MMA instructions
Artem Belevich via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Jun 25 10:43:49 PDT 2021
tra accepted this revision.
tra added a comment.
LGTM. Would you like me to land the patch on your behalf?
================
Comment at: clang/lib/CodeGen/CGBuiltin.cpp:16397
unsigned NumEltsD;
std::array<unsigned, 8> Variants;
----------------
A comment here describing the expected arrangement of the variants would be helpful.
E.g. `ordered by layout-A/layout-B/satf, where 'row' has priority over 'col' for layout. The index of non-satf variants is expected to match the undocumented layout constants used by CUDA's mma.hpp`.
It would be cleaner if we could use designated initializers, but we can't use them until LLVM switches to C++20.
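To illustrate the ordering suggested above, here is a minimal sketch of how an index into `Variants` could be derived. It is not the code in the patch. The names `Layout`, `variantIndex`, and `MmaInfoSketch` are hypothetical, and the sketch assumes that the layout constants are row = 0 and col = 1, that layout-A is the more significant digit, and that the satf variants occupy the upper four slots.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical layout constants (assumption: row = 0, col = 1).
enum Layout : unsigned { Row = 0, Col = 1 };

// Hypothetical helper: slots 0..3 hold the non-satf variants in the order
// row/row, row/col, col/row, col/col; slots 4..7 repeat that order for satf.
constexpr std::size_t variantIndex(Layout A, Layout B, bool Satf) {
  return (Satf ? 4u : 0u) + 2u * A + B;
}

// Hypothetical stand-in for the struct under review; only the field that
// the comment is about is modelled here.
struct MmaInfoSketch {
  std::array<unsigned, 8> Variants;

  // Returns the intrinsic ID stored for the requested combination.
  unsigned lookup(Layout A, Layout B, bool Satf) const {
    return Variants[variantIndex(A, B, Satf)];
  }
};
```

With this arrangement the non-satf index is simply `2 * A + B`, so a caller holding a combined layout value in the range 0..3 can use it directly as the index and add 4 when satf is requested.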
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D104847/new/
https://reviews.llvm.org/D104847