[PATCH] D118415: AMDGPU: Reserve v32 if we may need to copy between AGPRs on gfx908

Christudasan Devadasan via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Jan 28 11:07:54 PST 2022


cdevadas added a comment.

In D118415#3278088 <https://reviews.llvm.org/D118415#3278088>, @rampitec wrote:

> In D118415#3278061 <https://reviews.llvm.org/D118415#3278061>, @arsenm wrote:
>
>> In D118415#3277958 <https://reviews.llvm.org/D118415#3277958>, @rampitec wrote:
>>
>>> Anyway, inability to run kernels at maximum occupancy is a show stopper itself.
>>
>> This is practically impossible if you are using mfma instructions anyway
>
> This might be OK, but consider two things: 1) you are not checking that agprs are unused (easy to fix); 2) there are some mfma instructions which only need a128. I don't believe there are kernels which fit that budget, but I cannot blindly rule it out either.
>
> I.e. I would prefer to fail compilation rather than reserving v32.

I think disallowing direct agpr -> agpr copies on gfx908 early, during instruction selection, would be a better fix. I am already working on a patch to avoid direct sgpr -> agpr copies. That way, we could avoid the Scavenger altogether while lowering COPY instructions.
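
To make the constraint concrete, here is a minimal standalone sketch of why these copies need a temporary VGPR on gfx908. It is not LLVM code: the `Bank`, `Reg`, `needsTempVGPR` and `lowerCopy` names are hypothetical, and it assumes (as I understand the gfx908 behaviour) that v_accvgpr_write_b32 cannot take an AGPR or SGPR source, so such copies have to be staged through a VGPR.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of the register banks under discussion (not LLVM types).
enum class Bank { SGPR, VGPR, AGPR };

struct Reg {
  Bank bank;
  unsigned index;
};

// Render a register as assembly-like text, e.g. s5, v32, a3.
std::string name(const Reg &r) {
  const char prefix =
      r.bank == Bank::SGPR ? 's' : (r.bank == Bank::VGPR ? 'v' : 'a');
  return std::string(1, prefix) + std::to_string(r.index);
}

// Assumption: on gfx908 an AGPR can only be written from a VGPR, so any
// other source bank needs a temporary VGPR.
bool needsTempVGPR(const Reg &dst, const Reg &src) {
  return dst.bank == Bank::AGPR && src.bank != Bank::VGPR;
}

// Expand a 32-bit copy into gfx908-style instructions. Only VGPR and AGPR
// destinations are modelled. 'tmp' is used only when needsTempVGPR() is true.
std::vector<std::string> lowerCopy(const Reg &dst, const Reg &src,
                                   const Reg &tmp) {
  assert(dst.bank != Bank::SGPR && "SGPR destinations are out of scope here");
  assert(tmp.bank == Bank::VGPR && "the temporary must be a VGPR");

  std::vector<std::string> out;
  if (dst.bank == Bank::VGPR) {
    // Reading an AGPR into a VGPR is a single instruction.
    const char *op = src.bank == Bank::AGPR ? "v_accvgpr_read_b32 " : "v_mov_b32 ";
    out.push_back(op + name(dst) + ", " + name(src));
    return out;
  }

  // dst is an AGPR from here on.
  if (!needsTempVGPR(dst, src)) {
    out.push_back("v_accvgpr_write_b32 " + name(dst) + ", " + name(src));
    return out;
  }

  // Stage the value through the temporary VGPR first.
  const char *stage = src.bank == Bank::AGPR ? "v_accvgpr_read_b32 " : "v_mov_b32 ";
  out.push_back(stage + name(tmp) + ", " + name(src));
  out.push_back("v_accvgpr_write_b32 " + name(dst) + ", " + name(tmp));
  return out;
}
```

In this model, calling `lowerCopy` with an AGPR destination and an AGPR or SGPR source yields two instructions and consumes the temporary, which is the case that currently forces either a reserved v32 or a scavenged register. If instruction selection never produced such copies, only the single-instruction paths would remain.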


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D118415/new/

https://reviews.llvm.org/D118415


