[PATCH] D119475: [AMDGPU] Add scheduler pass to rematerialize trivial defs

Stanislav Mekhanoshin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Feb 14 13:37:53 PST 2022


rampitec added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp:799
+  // occupancy. Keep the modified live-in set from second check.
+  if (!ImprovedAfterSinkingLiveThrus) {
+    // Keep a list of newly rematerialized instructions so that we can easily
----------------
vangthao wrote:
> rampitec wrote:
> > vangthao wrote:
> > > rampitec wrote:
> > > > vangthao wrote:
> > > > > rampitec wrote:
> > > > > > I do not think you really need the whole block above dealing with live-through estimations. First, sinking a live-through is not guaranteed to help the RP. Second, it is essentially the same as this block, which actually performs the rematerialization, just less precise. Keep just the rematerialization part.
> > > > > If a live-through is increasing RP in this block, then by rematerializing it don't we decrease RP by the live-throughs that we rematerialized?
> > > > I do not think LICM will hoist anything like that. Even if it does, the code below will handle it as well.
> > > The code below only checks for trivially rematerializable defs used in the high RP block. If the def is live-through and used in a low RP block then it will not be checked. This checks for that and sinks those defs to lower RP blocks so we can decrease live-through RP.
> > I see 2 situations where something sinkable can be live-through:
> > 
> > 1)
> > ```
> > Def
> > loop:
> >   ...
> > cbr loop
> > Use
> > ```
> > There is absolutely no reason for any pass to hoist that Def high; most likely it will be sunk much earlier.
> > 
> > 2)
> > ```
> > Def
> > loop1:
> >   loop2:
> >     Use
> >   cbr loop2
> > cbr loop1
> > ```
> > Assuming you have a high live-through pressure in loop1, sinking Def to Use in loop2 is unlikely to help, as the highest pressure is likely in loop2 anyway.
> What about this situation? The def is live throughout the whole loop since it is needed in each iteration.
> 
> ```
> Def
> loop1:
>   ...
>   ... (high rp block)
>   ...
>   use (lower rp block)
>   cbr loop1
> ```
> 
This can probably happen, though I doubt it is a common situation. But I still do not understand why the code below would not handle it. You are only collecting instructions whose defined register is in the live-in pressure of the high RP block, so this def will be collected. Then below you sink it to the use. That is, as far as I can tell, it will do the same thing automatically.
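
To make the argument concrete, here is a toy model of the loop example. It is only a sketch and not the code from the patch: `Block`, `Def`, `collectCandidates`, `sinkToUse` and `runExample` are hypothetical names, and it assumes each def has a single use, so that once the def sits next to its use the register is no longer live into any block.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Toy stand-ins, not LLVM data structures.
struct Block {
  std::set<int> LiveIns; // virtual registers live on entry to the block
};

struct Def {
  int Reg;                        // register defined by the instruction
  bool TriviallyRematerializable; // assumed to be known up front
  std::size_t UseBlock;           // index of the block holding the single use
};

// Collect defs whose register is part of the live-in set of the high RP
// block. A def that is merely live-through that block is collected too.
std::vector<Def> collectCandidates(const Block &HighRP,
                                   const std::vector<Def> &Defs) {
  std::vector<Def> Candidates;
  for (const Def &D : Defs)
    if (D.TriviallyRematerializable && HighRP.LiveIns.count(D.Reg) != 0)
      Candidates.push_back(D);
  return Candidates;
}

// Sink the def to its single use. Under the single-use assumption the
// register is then defined and consumed inside UseBlock, so it drops out
// of every live-in set, including that of the high RP block.
void sinkToUse(const Def &D, std::vector<Block> &Blocks) {
  assert(D.TriviallyRematerializable && D.UseBlock < Blocks.size());
  for (Block &B : Blocks)
    B.LiveIns.erase(D.Reg);
}

// Loop body with four blocks: block 1 is the high RP block and block 3
// holds the use. Register 7 is rematerializable, register 8 is not.
std::size_t runExample() {
  std::vector<Block> Blocks(4);
  for (Block &B : Blocks)
    B.LiveIns = {7, 8};
  std::vector<Def> Defs = {{7, true, 3}, {8, false, 3}};
  const std::size_t HighRP = 1;
  for (const Def &D : collectCandidates(Blocks[HighRP], Defs))
    sinkToUse(D, Blocks);
  return Blocks[HighRP].LiveIns.size(); // live-ins left in the high RP block
}
```

In this model the def of register 7 is only live-through the high RP block, yet it is still collected and sunk, leaving a single live-in there. That is the sense in which a separate live-through estimation looks redundant to me.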


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D119475/new/

https://reviews.llvm.org/D119475


