[all-commits] [llvm/llvm-project] 02c246: [Clang][NVPTX] Add NVPTX intrinsics and builtins f...
Steffen Larsen via All-commits
all-commits at lists.llvm.org
Mon May 17 09:47:55 PDT 2021
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: 02c2468864bbb37f7b279aff84961815c1500b6c
https://github.com/llvm/llvm-project/commit/02c2468864bbb37f7b279aff84961815c1500b6c
Author: Stuart Adams <stuart.adams at codeplay.com>
Date: 2021-05-17 (Mon, 17 May 2021)
Changed paths:
M clang/include/clang/Basic/BuiltinsNVPTX.def
M clang/test/CodeGen/builtins-nvptx.c
M llvm/include/llvm/IR/IntrinsicsNVVM.td
M llvm/lib/Target/NVPTX/NVPTXInstrInfo.td
M llvm/lib/Target/NVPTX/NVPTXIntrinsics.td
A llvm/test/CodeGen/NVPTX/async-copy.ll
A llvm/test/CodeGen/NVPTX/mbarrier.ll
Log Message:
-----------
[Clang][NVPTX] Add NVPTX intrinsics and builtins for CUDA PTX cp.async instructions
Adds NVPTX builtins and intrinsics for the CUDA PTX `cp.async` instructions on
the `sm_80` architecture and newer.
PTX ISA description of `cp.async`:
https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#data-movement-and-conversion-instructions-asynchronous-copy
https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#parallel-synchronization-and-communication-instructions-cp-async-mbarrier-arrive
Authored-by: Stuart Adams <stuart.adams at codeplay.com>
Co-Authored-by: Alexander Johnston <alexander at codeplay.com>
Differential Revision: https://reviews.llvm.org/D100394
Commit: f226e28a880f8e40b1bfd4c77b9768a667372d22
https://github.com/llvm/llvm-project/commit/f226e28a880f8e40b1bfd4c77b9768a667372d22
Author: Steffen Larsen <steffen.larsen at codeplay.com>
Date: 2021-05-17 (Mon, 17 May 2021)
Changed paths:
M clang/include/clang/Basic/BuiltinsNVPTX.def
A clang/test/CodeGenCUDA/redux-builtins.cu
M llvm/include/llvm/IR/IntrinsicsNVVM.td
M llvm/lib/Target/NVPTX/NVPTXIntrinsics.td
A llvm/test/CodeGen/NVPTX/redux-sync.ll
Log Message:
-----------
[Clang][NVPTX] Add NVPTX intrinsics and builtins for CUDA PTX redux.sync instructions
Adds NVPTX builtins and intrinsics for the CUDA PTX `redux.sync` instructions
on the `sm_80` architecture and newer.
PTX ISA description of `redux.sync`:
https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#parallel-synchronization-and-communication-instructions-redux-sync
Authored-by: Steffen Larsen <steffen.larsen at codeplay.com>
Differential Revision: https://reviews.llvm.org/D100124
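For readers unfamiliar with `redux.sync`, the following is a minimal host-side sketch of its semantics as described in the PTX ISA: the values held by the lanes named in a member mask are combined with one operation, and every participating lane receives the same result. It is not code from this commit. The names `kWarpSize`, `redux_sync_add_model` and `redux_sync_min_model` are hypothetical and chosen for illustration only. The builtin and intrinsic names actually added by the commit are not reproduced here.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical reference model, not part of the commit: a warp is modelled
// as 32 lanes, each contributing one 32-bit value.
constexpr int kWarpSize = 32;
using WarpValues = std::array<std::int32_t, kWarpSize>;

// Models the `add` form of redux.sync on signed 32-bit values.
// Only lanes whose bit is set in member_mask contribute.
// The sum is accumulated as unsigned so that overflow wraps instead of
// being undefined behaviour in this host model.
inline std::int32_t redux_sync_add_model(const WarpValues& lane_values,
                                         std::uint32_t member_mask) {
    std::uint32_t sum = 0;
    for (int lane = 0; lane < kWarpSize; ++lane) {
        if (member_mask & (1u << lane)) {
            sum += static_cast<std::uint32_t>(lane_values[lane]);
        }
    }
    return static_cast<std::int32_t>(sum);
}

// Models the `min` form of redux.sync on signed 32-bit values.
// The mask must name at least one lane, otherwise there is nothing to reduce.
inline std::int32_t redux_sync_min_model(const WarpValues& lane_values,
                                         std::uint32_t member_mask) {
    assert(member_mask != 0 && "at least one lane must participate");
    bool have_value = false;
    std::int32_t result = 0;
    for (int lane = 0; lane < kWarpSize; ++lane) {
        if (member_mask & (1u << lane)) {
            if (!have_value || lane_values[lane] < result) {
                result = lane_values[lane];
                have_value = true;
            }
        }
    }
    return result;
}
```

On hardware the reduction is a single warp-level instruction; the loops above only spell out which lanes take part and what value they all observe.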
Compare: https://github.com/llvm/llvm-project/compare/1417ddafdb68...f226e28a880f