[PATCH] D100394: [Clang][NVPTX] Add NVPTX intrinsics and builtins for CUDA PTX cp.async instructions
Stuart Adams via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Apr 13 10:03:49 PDT 2021
nyalloc created this revision.
nyalloc added reviewers: tra, jholewinski, jdoerfert.
nyalloc added projects: clang, LLVM.
Herald added subscribers: hiraditya, yaxunl.
nyalloc requested review of this revision.
Herald added a subscriber: llvm-commits.
Adds NVPTX builtins and intrinsics for the CUDA PTX `cp.async` instructions, which are available on the `sm_80` architecture and newer.
PTX ISA description of `cp.async`:
https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#data-movement-and-conversion-instructions-asynchronous-copy
https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#parallel-synchronization-and-communication-instructions-cp-async-mbarrier-arrive
Authored-by: Stuart Adams <stuart.adams at codeplay.com>
Co-Authored-by: Alexander Johnston <alexander at codeplay.com>
https://reviews.llvm.org/D100394
Files:
clang/include/clang/Basic/BuiltinsNVPTX.def
clang/test/CodeGen/builtins-nvptx.c
llvm/include/llvm/IR/IntrinsicsNVVM.td
llvm/lib/Target/NVPTX/NVPTXInstrInfo.td
llvm/lib/Target/NVPTX/NVPTXIntrinsics.td
llvm/test/CodeGen/NVPTX/async-copy.ll
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D100394.337192.patch
Type: text/x-patch
Size: 14437 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20210413/1528e154/attachment.bin>