[PATCH] D71221: [HIP] Add option --gpu-max-threads-per-block=n

Artem Belevich via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Thu Dec 19 14:11:32 PST 2019


tra added a comment.

What's the use case for this flag?



================
Comment at: clang/lib/CodeGen/TargetInfo.cpp:8067
+    unsigned MaxThreadsPerBlock =
+        IsHIPKernel ? M.getLangOpts().GPUMaxThreadsPerBlock : 256;
+    std::string AttrVal = std::string("1,") + llvm::utostr(MaxThreadsPerBlock);
----------------
The magic value of 256 should be defined as a constant or macro somewhere -- you're using it in multiple places.
Alternatively, always set LangOpts.GPUMaxThreadsPerBlock to something and skip figuring out the default everywhere else.
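
A minimal sketch of both suggestions, under stated assumptions: `DefaultMaxThreadsPerBlock` and `LangOptsSketch` are hypothetical names used only for illustration (the latter stands in for the relevant part of `LangOptions`), and `std::to_string` replaces `llvm::utostr` so the snippet compiles without LLVM headers.

```cpp
#include <cassert>
#include <string>

// Hypothetical name: a single definition of the default replaces the
// literal 256 repeated across call sites.
constexpr unsigned DefaultMaxThreadsPerBlock = 256;

// Illustrative stand-in for the relevant part of LangOptions (an assumption,
// not the real class). The field is always initialized, so callers never
// have to work out a default themselves.
struct LangOptsSketch {
  unsigned GPUMaxThreadsPerBlock = DefaultMaxThreadsPerBlock;
};

// Builds the "min,max" attribute value string, mirroring the quoted hunk.
std::string flatWorkGroupSizeAttrVal(bool IsHIPKernel,
                                     const LangOptsSketch &Opts) {
  unsigned MaxThreadsPerBlock =
      IsHIPKernel ? Opts.GPUMaxThreadsPerBlock : DefaultMaxThreadsPerBlock;
  assert(MaxThreadsPerBlock >= 1 && "upper bound below the lower bound of 1");
  return std::string("1,") + std::to_string(MaxThreadsPerBlock);
}
```

With the option always populated, the only place that needs to know the number 256 is the constant's definition.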


================
Comment at: clang/test/CodeGenCUDA/amdgpu-kernel-attrs.cu:19
+
 __attribute__((amdgpu_flat_work_group_size(32, 64))) // expected-no-diagnostics
 __global__ void flat_work_group_size_32_64() {
----------------
Is this the attribute that `__launch_bounds__()` expands to in HIP?
If `__launch_bounds__` is a separate attribute, then, I guess, it should be tested, too.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71221/new/

https://reviews.llvm.org/D71221




