[llvm] [AMDGPU][Scheduler] Consistent occupancy calculation during rematerialization (PR #149224)
Lucas Ramirez via llvm-commits
llvm-commits at lists.llvm.org
Sun Aug 3 16:49:05 PDT 2025
================
@@ -412,16 +411,19 @@ bool GCNRPTarget::isSaveBeneficial(Register Reg,
return RP.getSGPRNum() > MaxSGPRs;
unsigned NumVGPRs =
SRI->isAGPRClass(RC) ? RP.getAGPRNum() : RP.getArchVGPRNum();
- return isVGPRBankSaveBeneficial(NumVGPRs);
+ // The addressable limit must always be respected.
+ if (NumVGPRs > MaxVGPRs)
+ return true;
+ // For unified RFs, combined VGPR usage limit must be respected as well.
+ return UnifiedRF && RP.getVGPRNum(true) > MaxUnifiedVGPRs;
----------------
lucas-rami wrote:
> By reducing cross RC pressure any time we're over the MaxUnifiedVGPRs, we are telling the rematerializer to issue cross RC copies to increase occupancy.
Apologies, I am not sure I understand.
I think we agree on the spilling case ($MaxVGPRs=256 \wedge MaxUnifiedVGPRs=512$), since there $NumVGPRsInRC \leq MaxVGPRs \wedge RP.getVGPRNum(true) > MaxUnifiedVGPRs \Longrightarrow NumVGPRsInOtherRC > MaxVGPRs$ (modulo the VGPR allocation granule in the unified computation). In other words, we only do cross-RC saves when the other RC has too many excess VGPRs to fit through copies in the current RC.
For the occupancy increase case ($0<MaxVGPRs=MaxUnifiedVGPRs\leq256$), we always have $NumVGPRsInRC<256$ and $NumVGPRsInOtherRC<256$; otherwise the stage would be trying to reduce spilling. If $NumVGPRsInRC \leq MaxVGPRs \wedge RP.getVGPRNum(true) > MaxUnifiedVGPRs$, isn't any VGPR/AGPR save beneficial? Is there a chance that always saving in that situation increases the number of cross-RC copies?
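To make the two cases concrete, here is a minimal standalone sketch of the predicate from the hunk above. It is not the actual `GCNRPTarget` code: `VGPRPressure`, `VGPRLimits`, and `isVGPRSaveBeneficial` are hypothetical names for illustration, and the unified count is simplified to a plain sum, ignoring the VGPR allocation granule mentioned above.

```cpp
// Hypothetical stand-in for the per-bank pressure tracked by the RP target.
struct VGPRPressure {
  unsigned ArchVGPRs;
  unsigned AGPRs;
  // Assumption: plain sum; the real unified computation also rounds to the
  // VGPR allocation granule.
  unsigned unified() const { return ArchVGPRs + AGPRs; }
};

// Hypothetical stand-in for the limits the target is trying to reach.
struct VGPRLimits {
  unsigned MaxVGPRs;        // per-bank addressable limit
  unsigned MaxUnifiedVGPRs; // combined limit, only meaningful for unified RFs
  bool UnifiedRF;
};

// Mirrors the logic in the hunk: saving a register of the given bank is
// beneficial if that bank exceeds its own limit or, on a unified RF, if the
// combined usage exceeds the unified limit.
bool isVGPRSaveBeneficial(const VGPRPressure &RP, const VGPRLimits &L,
                          bool IsAGPR) {
  unsigned NumVGPRs = IsAGPR ? RP.AGPRs : RP.ArchVGPRs;
  // The addressable limit must always be respected.
  if (NumVGPRs > L.MaxVGPRs)
    return true;
  // For unified RFs, the combined usage limit must be respected as well.
  return L.UnifiedRF && RP.unified() > L.MaxUnifiedVGPRs;
}
```

Under these assumptions, with limits 256/512 and pressure 200 ArchVGPRs + 320 AGPRs, an ArchVGPR save is reported as beneficial only through the unified check, which matches the spilling case above. With limits 128/128 and pressure 100 + 40, saves in both banks are reported as beneficial, which is the occupancy case being asked about.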
https://github.com/llvm/llvm-project/pull/149224