[PATCH] D118415: AMDGPU: Reserve v32 if we may need to copy between AGPRs on gfx908

Stanislav Mekhanoshin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Jan 27 16:47:29 PST 2022


rampitec added a comment.

In D118415#3277923 <https://reviews.llvm.org/D118415#3277923>, @arsenm wrote:

> In D118415#3277872 <https://reviews.llvm.org/D118415#3277872>, @rampitec wrote:
>
>> I think it has to be dynamic, depending on the requested occupancy. But even then it can drop the occupancy of a kernel if it uses fewer than 32 registers, which is not uncommon. I do not believe we can reserve it that high.
>
> Reserving in the function argument range is a problem. We also treat the requested occupancy as a hint, not something we're forced to follow.

Documentation (https://clang.llvm.org/docs/AttributeReference.html#amdgpu-waves-per-eu) is a bit contradictory:

> An error will be given if:
>
> - Specified values violate subtarget specifications;
> - Specified values are not compatible with values provided through other attributes;
> - The AMDGPU target backend is unable to create machine code that can meet the request.

So it is an error. But then:

> This attribute may be attached to a kernel function definition and is an optimization hint.

Anyway, the inability to run kernels at maximum occupancy is a show stopper in itself.
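
To make the occupancy concern concrete, here is a rough sketch of the arithmetic. The gfx908 parameters below (256 VGPRs per SIMD lane, an allocation granule of 4, at most 10 waves per EU) are assumptions and are not stated in this thread. The helper names are hypothetical, and the model ignores SGPR, AGPR and LDS limits.

```python
# Assumed gfx908 parameters (not taken from this thread).
TOTAL_VGPRS = 256       # VGPRs available per lane on each SIMD
ALLOC_GRANULE = 4       # VGPRs are allocated in blocks of this size
MAX_WAVES_PER_EU = 10   # hardware cap on waves per SIMD


def waves_per_eu(num_vgprs):
    """Upper bound on occupancy for a kernel that uses num_vgprs VGPRs."""
    if num_vgprs <= 0:
        return MAX_WAVES_PER_EU
    # Round the register count up to the allocation granule.
    allocated = -(-num_vgprs // ALLOC_GRANULE) * ALLOC_GRANULE
    return min(MAX_WAVES_PER_EU, TOTAL_VGPRS // allocated)


def waves_with_reserved(num_vgprs, reserved_index):
    """Occupancy bound when register v<reserved_index> is also counted as used.

    Using v<N> means the allocation has to cover indices 0..N, i.e. N + 1 registers.
    """
    return waves_per_eu(max(num_vgprs, reserved_index + 1))


if __name__ == "__main__":
    print(waves_per_eu(24))             # small kernel, nothing reserved
    print(waves_with_reserved(24, 32))  # same kernel once v32 is counted
```

Under these assumptions a kernel using 24 VGPRs fits 10 waves, but once v32 is counted the same kernel is limited to 7.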


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D118415/new/

https://reviews.llvm.org/D118415


