[all-commits] [llvm/llvm-project] ac7253: [Driver] Add `-f[no-]offload-uniform-block`
Yaxun (Sam) Liu via All-commits
all-commits at lists.llvm.org
Thu Jul 27 13:36:36 PDT 2023
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: ac725310433aea6a7c808b11dab6f1a7d4ecf78e
https://github.com/llvm/llvm-project/commit/ac725310433aea6a7c808b11dab6f1a7d4ecf78e
Author: Yaxun (Sam) Liu <yaxun.liu at amd.com>
Date: 2023-07-27 (Thu, 27 Jul 2023)
Changed paths:
M clang/include/clang/Basic/CodeGenOptions.def
M clang/include/clang/Basic/LangOptions.def
M clang/include/clang/Driver/Options.td
M clang/lib/CodeGen/CGCall.cpp
M clang/lib/CodeGen/Targets/AMDGPU.cpp
M clang/lib/Driver/ToolChains/Clang.cpp
M clang/test/CodeGenCUDA/amdgpu-kernel-attrs.cu
M clang/test/CodeGenOpenCL/cl-uniform-wg-size.cl
M clang/test/Driver/hip-options.hip
M clang/test/Driver/opencl.cl
Log Message:
-----------
[Driver] Add `-f[no-]offload-uniform-block`
By default, clang assumes HIP kernels are launched with a uniform block size,
which is the case for kernels launched through the triple-chevron syntax or
hipLaunchKernelGGL. Clang adds the uniform-work-group-size function attribute
to HIP kernels so that the backend can optimize based on that assumption.
However, in rare cases HIP kernels are launched through
hipExtModuleLaunchKernel, where the global work size is specified,
which may result in a non-uniform block size.
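To see why specifying a global work size can produce a non-uniform block size, the sketch below splits a 1-D global size into blocks. It is an illustration only: `block_sizes` and `is_uniform` are hypothetical helpers, not HIP or clang APIs, and the code assumes the remainder lands in one smaller final block, as with OpenCL 2.0 non-uniform work-groups.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not a HIP or clang API): split a 1-D global work size
// into consecutive blocks of at most `block_size` work-items each.
// Assumption: any remainder forms one smaller block at the end.
std::vector<std::size_t> block_sizes(std::size_t global_size,
                                     std::size_t block_size) {
    assert(block_size > 0 && "block size must be positive");
    std::vector<std::size_t> sizes;
    for (std::size_t start = 0; start < global_size; start += block_size) {
        std::size_t remaining = global_size - start;
        sizes.push_back(remaining < block_size ? remaining : block_size);
    }
    return sizes;
}

// The launch is uniform exactly when the block size divides the global size.
bool is_uniform(std::size_t global_size, std::size_t block_size) {
    return block_size != 0 && global_size % block_size == 0;
}
```

Under these assumptions, a global size of 1024 with block size 256 yields four equal blocks, while a global size of 1000 yields three blocks of 256 and a final block of 232, which is the case the uniform-work-group-size attribute rules out.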
To support non-uniform block sizes for HIP kernels, an option
`-f[no-]offload-uniform-block` is added. This option is generic for
offloading languages. It defaults to on for CUDA/HIP and off otherwise.
Make -cl-uniform-work-group-size an alias to -foffload-uniform-block.
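The defaults and the alias described above can be modeled roughly as follows. This is a sketch, not clang's implementation: the `Lang` enum and `offload_uniform_block` function are hypothetical, and the "last flag on the command line wins" behavior is an assumption based on the usual convention for paired `-f`/`-fno-` options.

```cpp
#include <string>
#include <vector>

// Hypothetical language tag for this illustration only.
enum class Lang { CUDA, HIP, OpenCL, Other };

// Models the effective value of the option for a given language and
// argument list. Assumption: the last matching flag wins.
bool offload_uniform_block(Lang lang, const std::vector<std::string>& args) {
    // Default stated in the commit message: on for CUDA/HIP, off otherwise.
    bool value = (lang == Lang::CUDA || lang == Lang::HIP);
    for (const std::string& arg : args) {
        if (arg == "-foffload-uniform-block" ||
            arg == "-cl-uniform-work-group-size") {  // alias per the commit
            value = true;
        } else if (arg == "-fno-offload-uniform-block") {
            value = false;
        }
    }
    return value;
}
```

In this model, a HIP compilation with no flags reports uniform blocks, passing `-fno-offload-uniform-block` turns that off, and an OpenCL compilation only reports uniform blocks when `-cl-uniform-work-group-size` (or the new flag) is given.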
Reviewed by: Siu Chi Chan, Matt Arsenault, Fangrui Song, Johannes Doerfert
Differential Revision: https://reviews.llvm.org/D155213
Fixes: SWDEV-406592