[PATCH] D109300: [AMDGPU] Make vector superclasses allocatable

Mon Sep 20 14:37:19 PDT 2021

rampitec added a comment.

In D109300#3010586 <https://reviews.llvm.org/D109300#3010586>, @arsenm wrote:

> In D109300#3010576 <https://reviews.llvm.org/D109300#3010576>, @rampitec wrote:
>
>> Do you know how RA will choose registers for an AV operand? V, A, and AV seem to have the same AllocationPriority, so what exactly will RA be doing?
>
> You can set an explicit allocation order for a class. I think what you get now is VGPRs are higher priority than AGPRs. The real point is that RA can decide to introduce RA temporary registers to relieve pressure

An explicit order interleaving registers is probably needed, at least on gfx908. However, it does not solve all the issues. Imagine you have set it and, as a result, allocated 3x32 VGPR tuples and 3x32 AGPR tuples. Then RA will start allocating smaller registers and will need 64 more VGPRs. It will end up with 96 + 64 = 160 VGPRs and 96 AGPRs, where it could have gotten away with 128 of each class. That is a 2x performance difference.
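
To make the arithmetic concrete, here is a rough sketch of the scenario. It is not how RA works; the names are made up for illustration, and it assumes that the six 32-register tuples are AV operands that may land in either file, that the later 64 registers must be VGPRs, and that the cost is driven by the larger of the two counts.

```python
# Hypothetical model of the example above, not actual allocator logic.
TUPLE_SIZE = 32   # registers per tuple
NUM_TUPLES = 6    # AV tuples that may go to either VGPRs or AGPRs
VGPR_ONLY = 64    # smaller registers allocated later, VGPR-only (assumption)


def usage(tuples_in_vgpr):
    """Return (VGPR count, AGPR count) for a given split of the tuples."""
    vgprs = tuples_in_vgpr * TUPLE_SIZE + VGPR_ONLY
    agprs = (NUM_TUPLES - tuples_in_vgpr) * TUPLE_SIZE
    return vgprs, agprs


# An interleaved order splits the tuples evenly: 3 in VGPRs, 3 in AGPRs.
interleaved = usage(3)

# The best split minimizes the larger of the two register counts.
best = min((usage(k) for k in range(NUM_TUPLES + 1)), key=max)

print(interleaved)  # (160, 96)
print(best)         # (128, 128)
```

The balanced result puts only two tuples in VGPRs and four in AGPRs, which an order fixed before the later VGPR demand is known cannot pick.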

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109300/new/

https://reviews.llvm.org/D109300