[PATCH] D71221: [HIP] Add option --gpu-max-threads-per-block=n
Artem Belevich via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Thu Dec 19 14:11:32 PST 2019
tra added a comment.
What's the use case for this flag?
================
Comment at: clang/lib/CodeGen/TargetInfo.cpp:8067
+ unsigned MaxThreadsPerBlock =
+ IsHIPKernel ? M.getLangOpts().GPUMaxThreadsPerBlock : 256;
+ std::string AttrVal = std::string("1,") + llvm::utostr(MaxThreadsPerBlock);
----------------
The magic value of 256 should be defined as a constant or macro somewhere -- you're using it in multiple places.
Alternatively, always set LangOpts.GPUMaxThreadsPerBlock to a default value, so that the code using it does not have to work out the default in each place.
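A minimal sketch of that alternative, under stated assumptions: `DefaultGPUMaxThreadsPerBlock` and `LangOptsSketch` are hypothetical names used only for illustration, not identifiers from the patch or from clang. The sketch also ignores the `IsHIPKernel` distinction in the quoted hunk. The idea is that the option always carries a value, so the code that builds the attribute string never needs its own copy of 256.

```cpp
#include <string>

// Hypothetical name; the patch under review does not define this constant.
constexpr unsigned DefaultGPUMaxThreadsPerBlock = 256;

// Simplified stand-in for clang's LangOptions, for illustration only.
struct LangOptsSketch {
  // Always initialized, so users never need to supply a fallback.
  unsigned GPUMaxThreadsPerBlock = DefaultGPUMaxThreadsPerBlock;
};

// Builds the "min,max" attribute value the same way the quoted hunk does,
// but reads the maximum from the options instead of repeating the literal.
std::string flatWorkGroupSizeValue(const LangOptsSketch &Opts) {
  return "1," + std::to_string(Opts.GPUMaxThreadsPerBlock);
}
```

With this shape, a command-line override only has to assign `GPUMaxThreadsPerBlock` once, and the default lives in a single place.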
================
Comment at: clang/test/CodeGenCUDA/amdgpu-kernel-attrs.cu:19
+
__attribute__((amdgpu_flat_work_group_size(32, 64))) // expected-no-diagnostics
__global__ void flat_work_group_size_32_64() {
----------------
Is this the attribute that `__launch_bounds__()` expands to in HIP?
If `__launch_bounds__` is a separate attribute, then I guess it should be tested, too.
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D71221/new/
https://reviews.llvm.org/D71221