[all-commits] [llvm/llvm-project] 02c246: [Clang][NVPTX] Add NVPTX intrinsics and builtins f...

Steffen Larsen via All-commits all-commits at lists.llvm.org
Mon May 17 09:47:55 PDT 2021


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 02c2468864bbb37f7b279aff84961815c1500b6c
      https://github.com/llvm/llvm-project/commit/02c2468864bbb37f7b279aff84961815c1500b6c
  Author: Stuart Adams <stuart.adams at codeplay.com>
  Date:   2021-05-17 (Mon, 17 May 2021)

  Changed paths:
    M clang/include/clang/Basic/BuiltinsNVPTX.def
    M clang/test/CodeGen/builtins-nvptx.c
    M llvm/include/llvm/IR/IntrinsicsNVVM.td
    M llvm/lib/Target/NVPTX/NVPTXInstrInfo.td
    M llvm/lib/Target/NVPTX/NVPTXIntrinsics.td
    A llvm/test/CodeGen/NVPTX/async-copy.ll
    A llvm/test/CodeGen/NVPTX/mbarrier.ll

  Log Message:
  -----------
  [Clang][NVPTX] Add NVPTX intrinsics and builtins for CUDA PTX cp.async instructions

Adds NVPTX builtins and intrinsics for the CUDA PTX `cp.async` instructions for
the `sm_80` architecture or newer.

PTX ISA descriptions of `cp.async` and `cp.async.mbarrier.arrive`:
https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#data-movement-and-conversion-instructions-asynchronous-copy
https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#parallel-synchronization-and-communication-instructions-cp-async-mbarrier-arrive
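
For readers unfamiliar with the group-based completion model that the PTX ISA
describes for `cp.async`, the following is a minimal host-side sketch in C++.
It is not device code and does not call the builtins added by this commit
(their names are not listed in this message). The class `AsyncCopyModel` and
its methods are hypothetical names chosen for illustration. The model assumes
the simplest legal behaviour: a copy is performed only when a wait forces its
group to complete, whereas real hardware may complete it at any earlier point.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <deque>
#include <utility>
#include <vector>

// Hypothetical host-side model of cp.async group semantics (illustration only).
struct PendingCopy {
  void* dst;
  const void* src;
  std::size_t bytes;
};

class AsyncCopyModel {
 public:
  // Models issuing one cp.async: the copy is recorded but not yet performed.
  // PTX allows copy sizes of 4, 8, or 16 bytes.
  void copy_async(void* dst, const void* src, std::size_t bytes) {
    assert(bytes == 4 || bytes == 8 || bytes == 16);
    open_group_.push_back({dst, src, bytes});
  }

  // Models cp.async.commit_group: bundles all copies issued since the
  // previous commit into one pending group.
  void commit_group() {
    groups_.push_back(std::move(open_group_));
    open_group_.clear();
  }

  // Models cp.async.wait_group n: completes the oldest groups until at most
  // n of the most recently committed groups remain pending.
  void wait_group(std::size_t n) {
    while (groups_.size() > n) {
      for (const PendingCopy& c : groups_.front())
        std::memcpy(c.dst, c.src, c.bytes);
      groups_.pop_front();
    }
  }

  std::size_t pending_groups() const { return groups_.size(); }

 private:
  std::vector<PendingCopy> open_group_;
  std::deque<std::vector<PendingCopy>> groups_;
};
```

In this model, committing two groups and then calling `wait_group(1)` completes
only the older group, which mirrors the double-buffering pattern the
instruction is typically used for.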

Authored-by: Stuart Adams <stuart.adams at codeplay.com>
Co-Authored-by: Alexander Johnston <alexander at codeplay.com>

Differential Revision: https://reviews.llvm.org/D100394


  Commit: f226e28a880f8e40b1bfd4c77b9768a667372d22
      https://github.com/llvm/llvm-project/commit/f226e28a880f8e40b1bfd4c77b9768a667372d22
  Author: Steffen Larsen <steffen.larsen at codeplay.com>
  Date:   2021-05-17 (Mon, 17 May 2021)

  Changed paths:
    M clang/include/clang/Basic/BuiltinsNVPTX.def
    A clang/test/CodeGenCUDA/redux-builtins.cu
    M llvm/include/llvm/IR/IntrinsicsNVVM.td
    M llvm/lib/Target/NVPTX/NVPTXIntrinsics.td
    A llvm/test/CodeGen/NVPTX/redux-sync.ll

  Log Message:
  -----------
  [Clang][NVPTX] Add NVPTX intrinsics and builtins for CUDA PTX redux.sync instructions

Adds NVPTX builtins and intrinsics for the CUDA PTX `redux.sync` instructions
for `sm_80` architecture or newer.

PTX ISA description of `redux.sync`:
https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#parallel-synchronization-and-communication-instructions-redux-sync
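
As a rough illustration of what `redux.sync` computes, the sketch below is a
host-side C++ reference model of the `add` and `max` variants on signed 32-bit
values: every lane named in the member mask contributes its value, and each
participating lane receives the same reduced result. It is not device code and
does not call the builtins added by this commit (their names are not listed in
this message). The function names and the fixed warp size of 32 are
assumptions made for the example.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical host-side model (illustration only): lane i of a 32-lane warp
// contributes values[i] when bit i of member_mask is set.
constexpr int kWarpSize = 32;
using WarpValues = std::array<std::int32_t, kWarpSize>;

// Models redux.sync.add.s32: sum over participating lanes. The sum is
// accumulated in unsigned arithmetic so that overflow wraps instead of
// being undefined behaviour.
inline std::int32_t redux_sync_add_model(const WarpValues& values,
                                         std::uint32_t member_mask) {
  assert(member_mask != 0 && "at least one lane must participate");
  std::uint32_t sum = 0;
  for (int lane = 0; lane < kWarpSize; ++lane) {
    if (member_mask & (1u << lane))
      sum += static_cast<std::uint32_t>(values[lane]);
  }
  return static_cast<std::int32_t>(sum);
}

// Models redux.sync.max.s32: signed maximum over participating lanes.
inline std::int32_t redux_sync_max_model(const WarpValues& values,
                                         std::uint32_t member_mask) {
  assert(member_mask != 0 && "at least one lane must participate");
  bool seen = false;
  std::int32_t best = 0;
  for (int lane = 0; lane < kWarpSize; ++lane) {
    if (!(member_mask & (1u << lane))) continue;
    if (!seen || values[lane] > best) best = values[lane];
    seen = true;
  }
  return best;
}
```

With lane `i` holding `i + 1`, a full mask of `0xFFFFFFFF` yields the sum of
1 through 32, while a mask of `0xF` restricts the reduction to lanes 0 to 3.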

Authored-by: Steffen Larsen <steffen.larsen at codeplay.com>

Differential Revision: https://reviews.llvm.org/D100124


Compare: https://github.com/llvm/llvm-project/compare/1417ddafdb68...f226e28a880f

