[PATCH] D89582: clang/AMDGPU: Apply workgroup related attributes to all functions

Matt Arsenault via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Fri Oct 16 12:35:20 PDT 2020


arsenm added a comment.

In D89582#2335671 <https://reviews.llvm.org/D89582#2335671>, @rampitec wrote:

> In D89582#2335619 <https://reviews.llvm.org/D89582#2335619>, @arsenm wrote:
>
>> In D89582#2335574 <https://reviews.llvm.org/D89582#2335574>, @yaxunl wrote:
>>
>>> What if a device function is called by kernels with different work group sizes, will caller's work group size override callee's work group size?
>>
>> It's user error to call a function with a larger range than the caller
>
> The problem is that a user can override the default on a kernel with the attribute, but cannot do so on a function. So a module can be compiled with a default smaller than the size requested on one of the kernels.
>
> If the default were the maximum of 1024 and could only be overridden with the --gpu-max-threads-per-block option, this would not be a problem, were it not for the description of the option:
>
>   LANGOPT(GPUMaxThreadsPerBlock, 32, 256, "default max threads per block for kernel launch bounds for HIP")
>
> I.e. it describes a "default", so it should be perfectly legal to set a higher limit on a specific kernel. If the option said it restricts the maximum, it would be legal to apply it to functions as well.

The current backend default ends up greatly restricting the registers available to these functions, which increases spilling.
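
To make the concern quoted above concrete, here is a minimal sketch of the mismatch, assuming the module default is applied to every device function while a kernel may override it with its own attribute. This is a hypothetical model for illustration only, not clang's or the backend's implementation; the names Kernel, kernelMaxThreads, functionMaxThreads and mismatchedCallers are made up for this example.

```cpp
#include <optional>
#include <string>
#include <vector>

// Hypothetical model of the discussion, not clang's implementation.
struct Kernel {
    std::string name;
    // Per-kernel launch bound from an attribute, if the user wrote one.
    std::optional<unsigned> launchBounds;
};

// A kernel uses its own attribute when present, otherwise the module default
// (the value of --gpu-max-threads-per-block in the quoted discussion).
unsigned kernelMaxThreads(const Kernel &k, unsigned moduleDefault) {
    return k.launchBounds.value_or(moduleDefault);
}

// Assumption: a device function cannot be overridden per function,
// so it always gets the module default.
unsigned functionMaxThreads(unsigned moduleDefault) {
    return moduleDefault;
}

// Returns the kernels whose bound exceeds the bound assumed for a callee
// they all share, i.e. the callers for which the callee was compiled
// under too small an assumption.
std::vector<std::string> mismatchedCallers(const std::vector<Kernel> &callers,
                                           unsigned moduleDefault) {
    std::vector<std::string> result;
    for (const Kernel &k : callers) {
        if (kernelMaxThreads(k, moduleDefault) > functionMaxThreads(moduleDefault)) {
            result.push_back(k.name);
        }
    }
    return result;
}
```

With a module default of 256 (the default in the LANGOPT line quoted above), a kernel without an attribute matches the callee's assumption, while a kernel that requests 1024 is reported as a mismatched caller.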


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D89582/new/

https://reviews.llvm.org/D89582


