[all-commits] [llvm/llvm-project] dfab31: [NVPTX] Add support for maxclusterrank in launch_b...

Jakub Chlanda via All-commits all-commits at lists.llvm.org
Tue Sep 26 23:51:41 PDT 2023


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: dfab31b41b4988b6dc8129840eba68f0c36c0f13
      https://github.com/llvm/llvm-project/commit/dfab31b41b4988b6dc8129840eba68f0c36c0f13
  Author: Jakub Chlanda <jakub at codeplay.com>
  Date:   2023-09-27 (Wed, 27 Sep 2023)

  Changed paths:
    M clang/include/clang/Basic/Attr.td
    M clang/include/clang/Basic/DiagnosticSemaKinds.td
    M clang/include/clang/Sema/Sema.h
    M clang/lib/Basic/Targets/NVPTX.h
    M clang/lib/CodeGen/Targets/NVPTX.cpp
    M clang/lib/Parse/ParseOpenMP.cpp
    M clang/lib/Sema/SemaDeclAttr.cpp
    M clang/lib/Sema/SemaTemplateInstantiateDecl.cpp
    M clang/test/CodeGenCUDA/launch-bounds.cu
    M clang/test/SemaCUDA/launch_bounds.cu
    A clang/test/SemaCUDA/launch_bounds_sm_90.cu
    M llvm/lib/Target/NVPTX/NVPTXAsmPrinter.cpp
    M llvm/lib/Target/NVPTX/NVPTXUtilities.cpp
    M llvm/lib/Target/NVPTX/NVPTXUtilities.h
    A llvm/test/CodeGen/NVPTX/maxclusterrank.ll

  Log Message:
  -----------
  [NVPTX] Add support for maxclusterrank in launch_bounds (#66496)

Since SM_90, CUDA supports an additional argument to the
launch_bounds attribute, maxBlocksPerCluster, which expresses the maximum
number of CTAs that can be part of a cluster. See:
https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#cluster-dimension-directives-maxclusterrank
and
https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#launch-bounds
for details.
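
For illustration, a minimal sketch of the three-argument form described
above. The kernel name, its parameters, and the bound values (128, 2, 4)
are hypothetical and are not taken from this commit or its tests. The
third argument is only expected to take effect when targeting sm_90 or
newer.

```cuda
// Hypothetical kernel, for illustration only.
// __launch_bounds__(maxThreadsPerBlock, minBlocksPerMultiprocessor,
//                   maxBlocksPerCluster)
// The third argument caps the number of CTAs per cluster at 4 here.
__global__ void __launch_bounds__(128, 2, 4)
scale(float *data, float factor, int n) {
  // One element per thread; the guard handles a partially filled last block.
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n)
    data[i] *= factor;
}
```

Per the PTX documentation linked above, a bound like this should
correspond to a .maxclusterrank directive on the kernel entry in the
generated PTX.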