[PATCH] D109300: [AMDGPU] Make vector superclasses allocatable
Stanislav Mekhanoshin via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Sep 20 14:37:19 PDT 2021
rampitec added a comment.
In D109300#3010586 <https://reviews.llvm.org/D109300#3010586>, @arsenm wrote:
> In D109300#3010576 <https://reviews.llvm.org/D109300#3010576>, @rampitec wrote:
>
>> Do you know how RA will choose registers for an AV operand? V, A, and AV seem to have the same AllocationPriority, so what exactly will RA be doing?
>
> You can set an explicit allocation order for a class. I think what you get now is VGPRs are higher priority than AGPRs. The real point is that RA can decide to introduce RA temporary registers to relieve pressure
An explicit order interleaving registers is probably needed, at least on gfx908. However, it does not solve all the issues. Imagine you have set it and, as a result, allocated 3x32 VGPR tuples and 3x32 AGPR tuples. Then RA starts allocating smaller registers and needs 64 more VGPRs. It will end up with 96 + 64 = 160 VGPRs and 96 AGPRs, where it could have gotten away with 128 of each class. That is a 2x performance difference.
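
To make the arithmetic concrete, here is a rough sketch of the two outcomes. The 256-register budget per file and the `waves` occupancy formula are simplifying assumptions of mine for illustration, not something stated in this review or taken from the backend; the names are hypothetical.

```python
# Simplified model of the example above. REGS_PER_FILE and the occupancy
# formula are illustrative assumptions, not values from the patch.
REGS_PER_FILE = 256  # assumed budget for each of the VGPR and AGPR files
TUPLE_SIZE = 32      # width of each tuple in the example
EXTRA_VGPRS = 64     # later demand that can only be satisfied with VGPRs


def waves(vgprs, agprs, budget=REGS_PER_FILE):
    """Waves that fit if the larger of the two allocations is the limit."""
    return budget // max(vgprs, agprs)


# Interleaved order: the six tuples are split 3 + 3 across the two files,
# then the VGPR-only demand is added on top of the VGPR side.
vgprs_interleaved = 3 * TUPLE_SIZE + EXTRA_VGPRS
agprs_interleaved = 3 * TUPLE_SIZE

# Balanced alternative: 2 tuples in VGPRs and 4 in AGPRs, leaving room
# for the VGPR-only demand.
vgprs_balanced = 2 * TUPLE_SIZE + EXTRA_VGPRS
agprs_balanced = 4 * TUPLE_SIZE

print(vgprs_interleaved, agprs_interleaved,
      waves(vgprs_interleaved, agprs_interleaved))
print(vgprs_balanced, agprs_balanced,
      waves(vgprs_balanced, agprs_balanced))
```

Under this model the interleaved split fits half as many waves as the balanced one, which is where the 2x figure comes from.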
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D109300/new/
https://reviews.llvm.org/D109300