[PATCH] D100394: [Clang][NVPTX] Add NVPTX intrinsics and builtins for CUDA PTX cp.async instructions
Stuart Adams via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Apr 16 04:05:22 PDT 2021
nyalloc updated this revision to Diff 338059.
nyalloc added a comment.
Updated diff to address review comments.
- mbarrier intrinsics, builtins, and tests are now included
- alignment and address space updated for the cp.async.mbarrier instructions
- cp_async_wait_all intrinsics and builtins updated to use immediate values
- copy intrinsics updated with appropriate attributes
- copy intrinsic and builtin arguments updated to use the appropriate address spaces
- missing tests added
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D100394/new/
https://reviews.llvm.org/D100394
Files:
clang/include/clang/Basic/BuiltinsNVPTX.def
clang/test/CodeGen/builtins-nvptx.c
llvm/include/llvm/IR/IntrinsicsNVVM.td
llvm/lib/Target/NVPTX/NVPTXInstrInfo.td
llvm/lib/Target/NVPTX/NVPTXIntrinsics.td
llvm/test/CodeGen/NVPTX/async-copy.ll
llvm/test/CodeGen/NVPTX/mbarrier.ll
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D100394.338059.patch
Type: text/x-patch
Size: 35272 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20210416/de1322da/attachment-0001.bin>