[PATCH] D59295: [AMDGPU] Pre-allocate WWM registers to reduce VGPR pressure.

Wed Mar 13 12:02:11 PDT 2019

cwabbott added a comment.

Actually, now that I think about it, I believe we realized that SIFixWWMLiveness has a giant hole in that if any of the extra live ranges we insert are split, it'll fall over. I don't think anyone has come up with a way to express the constraints only with extra defs and uses in a way that always works, and I'm not sure it's possible. The issue is that we're lying to LLVM RA by pretending that vector instructions always fully clobber their destinations, and while before we were careful to never write to any inactive channels in order to keep up the charade, but WWM instructions force us to deal with it somehow. Fully informing LLVM of what's going on would involve marking every vector instruction as partially clobbering its destination, even the move instructions and load/store instructions LLVM emits during RA, which of course would tank performance unless LLVM is taught about predicated liveness -- but of course that's a whole lot of work that opens another can of worms (register pressure is suddenly not that meaningful anymore...).

In D59295#1427874 <https://reviews.llvm.org/D59295#1427874>, @cwabbott wrote:

> The best thing we could come up with that does this was the current SIFixWWMLiveness pass, and even with subsequent modifications, it's still pretty terrible in practice.

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D59295/new/

https://reviews.llvm.org/D59295