[PATCH] D123525: [AMDGPU] On gfx908, reserve VGPR for AGPR copy based on register budget.
Stanislav Mekhanoshin via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Apr 12 10:04:20 PDT 2022
rampitec added a comment.
Let's begin with the question: what problem are you trying to solve?
================
Comment at: llvm/lib/Target/AMDGPU/SIInstrInfo.cpp:628
// or spill with the slot.
while (RegNo-- && RS.FindUnusedReg(&AMDGPU::VGPR_32RegClass)) {
Register Tmp2 = RS.scavengeRegister(&AMDGPU::VGPR_32RegClass, 0);
----------------
hsmhsm wrote:
> rampitec wrote:
> > It will never get replaced if RegNo == 0.
> Yes, I noticed it. I think, we need to take a relook at it.
This completely breaks the original logic and idea. This loop exists because there are 2 waitstates needed to reuse a VGPR used in the v_accvgpr_write on gfx908. When you are copying a wide register, say a 16 dword tuple, we would have to insert 30 nops is we are using a single register. Therefore this code attempts to use another register each time until we have copied 3 dwords. Hence the 'DestReg % 3'. This code just does not make sense anymore.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D123525/new/
https://reviews.llvm.org/D123525
More information about the llvm-commits
mailing list