[PATCH] D97218: [AMDGPU] Set threshold for regbanks reassign pass

Tue Feb 23 12:22:24 PST 2021

rampitec added inline comments.

================
Comment at: llvm/lib/Target/AMDGPU/GCNRegBankReassign.cpp:54
+  cl::desc("Max number of vregs to run the regbanks reassign pass"),
+  cl::init(100000), cl::Hidden);
+
----------------
arsenm wrote:
> This seems like a pretty low threshold
It is. Here is what happens when we have ~150000 vregs:

```
  147.0184 ( 75.8%)   0.0000 (  0.0%)  147.0184 ( 75.5%)  147.1465 ( 75.5%)  GCN RegBank Reassign
  14.0944 (  7.3%)   0.0800 ( 12.7%)  14.1743 (  7.3%)  14.1812 (  7.3%)  Machine Instruction Scheduler
```
With ~100000 vregs the pass does not even show up near the top of the -time-passes output. So unless there are better ideas, we need to limit it.

One idea I have is to use a heuristic that accounts not only for the number of vregs but also for the number of registers allocated. What makes the pass slow is the checkInterference() call for every probed register at every conflict. The time will be proportional to the number of overlapping LIs at the point of conflict, which can be more or less approximated by the number of registers, at least in the "fattest" portion of a program. Moreover, the more overlapping LIs we have, the lower the chance that we will find a combination of registers that resolves a conflict.
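
A rough standalone sketch of such a heuristic is below. It is not taken from GCNRegBankReassign.cpp: the struct, the function names, and the cost budget are all hypothetical. Multiplying the two counts is only an assumed stand-in for "number of conflicts times overlapping LIs per conflict".

```
#include <cstdint>

// Hypothetical per-function statistics; these names do not exist in LLVM.
struct FunctionStats {
  uint64_t NumVRegs;         // virtual registers in the function
  uint64_t NumAllocatedRegs; // physical registers in use, a proxy for
                             // the number of overlapping LIs
};

// Assumed cost model: work grows with the number of vregs (more potential
// conflicts) times the number of allocated registers (more interference
// checks per probed register).
uint64_t estimateReassignCost(const FunctionStats &S) {
  return S.NumVRegs * S.NumAllocatedRegs;
}

// Keep the hard vreg threshold, then additionally reject functions whose
// estimated cost exceeds a (hypothetical) budget.
bool shouldRunReassign(const FunctionStats &S, uint64_t VRegThreshold,
                       uint64_t CostBudget) {
  if (S.NumVRegs > VRegThreshold)
    return false;
  return estimateReassignCost(S) <= CostBudget;
}
```

With a vreg threshold of 100000 and an arbitrary budget, a function below the threshold can still be skipped if it also uses many registers, which is the case where resolving a conflict is least likely anyway.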

If there were a cheap way to estimate register pressure at a given instruction, we could skip individual instructions in the search, but I am afraid RPT is not cheap.

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D97218