[all-commits] [llvm/llvm-project] dfab31: [NVPTX] Add support for maxclusterrank in launch_b...
Jakub Chlanda via All-commits
all-commits at lists.llvm.org
Tue Sep 26 23:51:41 PDT 2023
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: dfab31b41b4988b6dc8129840eba68f0c36c0f13
https://github.com/llvm/llvm-project/commit/dfab31b41b4988b6dc8129840eba68f0c36c0f13
Author: Jakub Chlanda <jakub at codeplay.com>
Date: 2023-09-27 (Wed, 27 Sep 2023)
Changed paths:
M clang/include/clang/Basic/Attr.td
M clang/include/clang/Basic/DiagnosticSemaKinds.td
M clang/include/clang/Sema/Sema.h
M clang/lib/Basic/Targets/NVPTX.h
M clang/lib/CodeGen/Targets/NVPTX.cpp
M clang/lib/Parse/ParseOpenMP.cpp
M clang/lib/Sema/SemaDeclAttr.cpp
M clang/lib/Sema/SemaTemplateInstantiateDecl.cpp
M clang/test/CodeGenCUDA/launch-bounds.cu
M clang/test/SemaCUDA/launch_bounds.cu
A clang/test/SemaCUDA/launch_bounds_sm_90.cu
M llvm/lib/Target/NVPTX/NVPTXAsmPrinter.cpp
M llvm/lib/Target/NVPTX/NVPTXUtilities.cpp
M llvm/lib/Target/NVPTX/NVPTXUtilities.h
A llvm/test/CodeGen/NVPTX/maxclusterrank.ll
Log Message:
-----------
[NVPTX] Add support for maxclusterrank in launch_bounds (#66496)
Since SM_90, CUDA supports an additional argument to the
launch_bounds attribute, maxBlocksPerCluster, which expresses the maximum
number of CTAs that can be part of a cluster. See:
https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#cluster-dimension-directives-maxclusterrank
and
https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#launch-bounds
for details.
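
As a rough illustration of the three-argument form described above, the sketch below annotates a kernel with all three launch bounds. The kernel name (scale) and the constant values are hypothetical and chosen only for the example; they are not taken from the commit or its tests. The third argument is only meaningful when compiling for sm_90 or newer.

```cuda
// Hypothetical limits, chosen for illustration only.
constexpr int kMaxThreadsPerBlock = 256;  // maxThreadsPerBlock
constexpr int kMinBlocksPerSM = 2;        // minBlocksPerMultiprocessor
constexpr int kMaxBlocksPerCluster = 4;   // maxBlocksPerCluster (sm_90+)

// Hypothetical kernel: multiplies each element of data by factor.
// The third launch-bounds argument caps the number of CTAs per cluster.
__global__ void
__launch_bounds__(kMaxThreadsPerBlock, kMinBlocksPerSM, kMaxBlocksPerCluster)
scale(float *data, float factor, int n) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n)
    data[i] *= factor;
}
```

Assuming a Clang build that includes this change and the file is saved as scale.cu (a hypothetical name), something like `clang++ -x cuda --cuda-gpu-arch=sm_90 --cuda-device-only -S scale.cu` should emit PTX for the kernel, where the cluster bound is expected to appear as a .maxclusterrank directive on the kernel entry.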