[PATCH] D117562: [AMDGPU] Sink immediate VGPR defs if high RP

Wed Jan 26 02:51:10 PST 2022

foad added a comment.

> Adding @foad, we need more eyes for this huge patch.

There are lots of moving parts here. In general MachineLICM seems bad at tracking register pressure. (For AMD people see SC1-2887.) In particular this comment is pertinent:

  // Besides removing computation from the loop, hoisting an instruction has
  // these effects:
  //
  // - The value defined by the instruction becomes live across the entire
  //   loop. This increases register pressure in the loop.

but as far as I can see, the code does nothing to account for the extra pressure throughout the whole loop.
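
To make the missing accounting concrete, here is a toy sketch. It is not MachineLICM code, and every name in it is made up for illustration. It assumes the hoisted def is an immediate (no operands) and that we already have a per-point live register count for the loop body.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model, not MachineLICM code. All names are hypothetical.
// Pressure[I] is the number of live registers at point I of the loop body.
// The def is an immediate (no operands), live from DefIdx to LastUseIdx.
struct ToyDef {
  std::size_t DefIdx;
  std::size_t LastUseIdx;
};

// Peak pressure in the loop body if the def is hoisted to the preheader.
unsigned maxPressureAfterHoist(std::vector<unsigned> Pressure, ToyDef D) {
  assert(D.DefIdx <= D.LastUseIdx && D.LastUseIdx < Pressure.size());
  for (std::size_t I = 0; I < Pressure.size(); ++I) {
    // Inside [DefIdx, LastUseIdx] the value was already counted. Everywhere
    // else it is newly live once it is defined outside the loop.
    if (I < D.DefIdx || I > D.LastUseIdx)
      ++Pressure[I];
  }
  return *std::max_element(Pressure.begin(), Pressure.end());
}
```

With pressure {3, 5, 4, 5} and a def live over points 0..1, the peak goes from 5 to 6, because point 3 was already at the peak and now carries one more live value.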

One idea we have discussed before is to simply disable MachineLICM for any loop where the register pressure is already "too high" (which would have to be decided by some target hook).
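
Roughly what I have in mind, as a toy sketch rather than a patch. The hook name and the "at or above the limit" threshold are assumptions of mine; no such hook exists in LLVM today as far as this comment is concerned.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Toy model, not LLVM code. All names are hypothetical.
struct ToyTargetHooks {
  unsigned VGPRLimit; // budget the target wants loops to stay within

  // Stand-in for the target hook: is this loop already "too high"?
  bool isLoopPressureTooHigh(unsigned PeakPressure) const {
    return PeakPressure >= VGPRLimit;
  }
};

// Decide once per loop whether hoisting is allowed at all.
// Pressure[I] is the number of live registers at point I of the loop body.
bool shouldRunLICMOnLoop(const std::vector<unsigned> &Pressure,
                         const ToyTargetHooks &TH) {
  assert(TH.VGPRLimit > 0 && "target must supply a budget");
  unsigned Peak = Pressure.empty()
                      ? 0
                      : *std::max_element(Pressure.begin(), Pressure.end());
  return !TH.isLoopPressureTooHigh(Peak);
}
```

This is deliberately coarse: it makes one decision per loop instead of trying to track pressure per hoisted instruction.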

As for RA not sinking defs back into the loop, I guess this is because RA does not know about occupancy? So here's a radical idea: after (pre-RA) scheduling, when we know what occupancy we //want// to achieve, why not make that occupancy a hard requirement, marking all other VGPRs reserved? Then RA would have to stick to the budget, spilling if necessary. And yes, spilling can be really bad, but just failing to meet your occupancy target can also be really bad.
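
A toy sketch of the budget arithmetic, to show what "marking all other VGPRs reserved" would amount to. This is not AMDGPU backend code: the names are made up, and the register file size and allocation granule are plain parameters here because the real values depend on the subtarget.

```cpp
#include <cassert>
#include <vector>

// Toy model, not LLVM code. All names and numbers are hypothetical.
struct ToyVGPRFile {
  unsigned TotalVGPRs; // registers shared by all waves on one SIMD
  unsigned Granule;    // allocation granularity per wave
};

// Largest per-wave VGPR count that still allows `Occupancy` waves.
unsigned vgprBudgetForOccupancy(const ToyVGPRFile &RF, unsigned Occupancy) {
  assert(Occupancy > 0 && RF.Granule > 0);
  unsigned PerWave = RF.TotalVGPRs / Occupancy;
  return PerWave - PerWave % RF.Granule; // round down to the granule
}

// Reserved[R] == true means the allocator may not use VGPR number R.
std::vector<bool> reserveAboveBudget(const ToyVGPRFile &RF,
                                     unsigned Occupancy) {
  unsigned Budget = vgprBudgetForOccupancy(RF, Occupancy);
  std::vector<bool> Reserved(RF.TotalVGPRs, false);
  for (unsigned R = Budget; R < RF.TotalVGPRs; ++R)
    Reserved[R] = true;
  return Reserved;
}
```

For example, with a 256-entry file, a granule of 4 and a target occupancy of 8, the budget is 32, so registers 32 and up would be reserved and anything that does not fit in the first 32 has to spill.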

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D117562