[PATCH] D60279: [CUDA] Implemented _[bi]mma* builtins.

Artem Belevich via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Apr 4 11:29:57 PDT 2019


tra created this revision.
tra added reviewers: timshen, jlebar.
Herald added subscribers: llvm-commits, bixia, hiraditya, jholewinski.
Herald added a project: LLVM.

These builtins provide access to the new integer and
sub-integer variants of MMA (matrix multiply-accumulate) instructions
provided by CUDA-10.x on sm_75 (AKA Turing) GPUs.
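
For readers unfamiliar with the operation, here is a minimal CPU-side sketch
of what an integer multiply-accumulate computes: D = A * B + C, with 8-bit
inputs and 32-bit accumulators. The name mma_reference, the row-major layout,
and the free-form M/N/K sizes are assumptions made for illustration only. The
sketch does not model the builtins added by this patch, the fragment layouts
and fixed tile shapes used by the hardware, any saturation behavior, or the
sub-integer variants.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical reference routine (not part of the patch): computes
// D = A * B + C on the CPU.
// A is m x k, B is k x n, C and D are m x n, all stored row-major.
// Inputs are signed 8-bit; accumulation is done in 32-bit integers.
std::vector<int32_t> mma_reference(const std::vector<int8_t>& a,
                                   const std::vector<int8_t>& b,
                                   const std::vector<int32_t>& c,
                                   std::size_t m, std::size_t n,
                                   std::size_t k) {
  assert(a.size() == m * k && b.size() == k * n && c.size() == m * n);
  std::vector<int32_t> d(c);  // start from the accumulator C
  for (std::size_t i = 0; i < m; ++i) {
    for (std::size_t j = 0; j < n; ++j) {
      for (std::size_t kk = 0; kk < k; ++kk) {
        // Widen to 32 bits before multiplying so the product cannot wrap.
        d[i * n + j] += static_cast<int32_t>(a[i * k + kk]) *
                        static_cast<int32_t>(b[kk * n + j]);
      }
    }
  }
  return d;
}
```

A routine like this is mainly useful as a host-side oracle when checking the
results of a device kernel on small inputs.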

Also added a feature for PTX 6.4. While Clang/LLVM does not generate
any PTX instructions that need it, we still need to pass it through to
ptxas in order to be able to compile code that uses the new 'mma'
instruction as inline assembly (e.g., as used by NVIDIA's CUTLASS library:
https://github.com/NVIDIA/cutlass/blob/master/cutlass/arch/mma.h#L101).


https://reviews.llvm.org/D60279

Files:
  clang/include/clang/Basic/BuiltinsNVPTX.def
  clang/lib/Basic/Targets/NVPTX.cpp
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/lib/Driver/ToolChains/Cuda.cpp
  clang/test/CodeGen/builtins-nvptx-mma.cu
  clang/test/CodeGen/builtins-nvptx-mma.py
  llvm/lib/Target/NVPTX/NVPTX.td

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D60279.193754.patch
Type: text/x-patch
Size: 103922 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20190404/532873cb/attachment-0001.bin>

