[llvm] [AMDGPU] Optionally Use GCNRPTrackers during scheduling (PR #93090)
Jeffrey Byrnes via llvm-commits
llvm-commits at lists.llvm.org
Thu Oct 3 15:59:36 PDT 2024
================
@@ -148,17 +154,44 @@ static bool canUsePressureDiffs(const SUnit &SU) {
return true;
}
-static void getRegisterPressures(bool AtTop,
- const RegPressureTracker &RPTracker, SUnit *SU,
- std::vector<unsigned> &Pressure,
- std::vector<unsigned> &MaxPressure) {
+static void getRegisterPressures(
+ bool AtTop, const RegPressureTracker &RPTracker, SUnit *SU,
+ std::vector<unsigned> &Pressure, std::vector<unsigned> &MaxPressure,
+ GCNDownwardRPTracker &DownwardTracker, GCNUpwardRPTracker &UpwardTracker,
+ ScheduleDAGMI *DAG, const SIRegisterInfo *SRI) {
// getDownwardPressure() and getUpwardPressure() make temporary changes to
// the tracker, so we need to pass those function a non-const copy.
RegPressureTracker &TempTracker = const_cast<RegPressureTracker &>(RPTracker);
- if (AtTop)
- TempTracker.getDownwardPressure(SU->getInstr(), Pressure, MaxPressure);
- else
- TempTracker.getUpwardPressure(SU->getInstr(), Pressure, MaxPressure);
+ if (!GCNTrackers) {
+ AtTop
+ ? TempTracker.getDownwardPressure(SU->getInstr(), Pressure, MaxPressure)
+ : TempTracker.getUpwardPressure(SU->getInstr(), Pressure, MaxPressure);
+
+ return;
+ }
+
+ // GCNTrackers
+ Pressure.resize(4, 0);
+ MachineInstr *MI = SU->getInstr();
+ if (AtTop) {
+ GCNDownwardRPTracker TempDownwardTracker(DownwardTracker);
+ TempDownwardTracker.bumpDownwardPressure(MI, SRI);
+ Pressure[AMDGPU::RegisterPressureSets::SReg_32] =
+ TempDownwardTracker.getPressure().getSGPRNum();
+ Pressure[AMDGPU::RegisterPressureSets::VGPR_32] =
+ TempDownwardTracker.getPressure().getArchVGPRNum();
+ Pressure[AMDGPU::RegisterPressureSets::AGPR_32] =
+ TempDownwardTracker.getPressure().getAGPRNum();
+ } else {
+ GCNUpwardRPTracker TempUpwardTracker(UpwardTracker);
----------------
jrbyrnes wrote:
This is coming from a real example which has code like
```
%0:vreg_64 = IMPLICIT_DEF
%1:vgpr_32 = IMPLICIT_DEF, implicit %0.sub0
%2:vgpr_32 = IMPLICIT_DEF, implicit %0.sub1
...
```
After bottom-up scheduling of the instruction defining `%2`, we speculate the RP for the instruction defining `%1`. When speculating using recede, we use `getLiveLaneMask` to calculate the mask of `%0.sub0`. To do so, we iterate over the `subranges()` of `%0` and find both `sub0` and `sub1` live at that position, because `%0.sub1` is still live at the index of the instruction defining `%1`. Thus we inaccurately calculate the use mask of `%0.sub0` as 0xF (covering both `sub0` and `sub1`).
In contrast, `getSubRegIndexLaneMask` (used by `bumpUpwardPressure`) returns just 0xC as the mask for the `sub0` subregister index.
For this small example there is no problem, but for wide registers with more subregisters these inaccuracies will affect scheduling. This doesn't impact RP calculations for whole regions with consistent `LIS` ordering, because the use mask inaccuracies don't affect the delta between the previous use mask and the new use mask.
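
To make the difference between the two queries concrete, here is a minimal standalone sketch. It does not use the LLVM API: `SubRange`, `liveLaneMaskAt`, `operandLaneMask`, `exampleSubRanges`, and the mask constants are hypothetical stand-ins, and the mask values are assumptions chosen only to match the numbers quoted above, not taken from the AMDGPU register definitions.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one subrange of a live interval: a lane mask plus
// the closed range of instruction slots [Start, End] where those lanes are live.
struct SubRange {
  uint64_t LaneMask;
  unsigned Start;
  unsigned End;
};

// Assumed mask values, chosen only to match the numbers quoted in this comment.
const uint64_t Sub0Mask = 0xC;
const uint64_t Sub1Mask = 0x3;

// Liveness-based query (the recede-style speculation described above):
// OR together every subrange that is live at Slot.
uint64_t liveLaneMaskAt(const std::vector<SubRange> &SubRanges, unsigned Slot) {
  uint64_t Mask = 0;
  for (const SubRange &SR : SubRanges)
    if (SR.Start <= Slot && Slot <= SR.End)
      Mask |= SR.LaneMask;
  return Mask;
}

// Operand-based query: only the lanes named by the subregister index.
uint64_t operandLaneMask(uint64_t SubRegIdxMask) { return SubRegIdxMask; }

// Subranges of %0 in the original (LIS) order: %0 is defined at slot 0,
// sub0 is read at slot 1 (def of %1), sub1 is read at slot 2 (def of %2).
std::vector<SubRange> exampleSubRanges() {
  return {{Sub0Mask, 0, 1}, {Sub1Mask, 0, 2}};
}
```

Under these assumptions, `liveLaneMaskAt(exampleSubRanges(), 1)` yields 0xF because `sub1` is still live at the slot of the instruction defining `%1`, while `operandLaneMask(Sub0Mask)` yields only 0xC.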
https://github.com/llvm/llvm-project/pull/93090