[PATCH] D123622: [AMDGPU] Allow sinking defs with multiple uses in PreRARematerialize scheduling stage

Tue Apr 12 10:29:07 PDT 2022

vangthao added a comment.

In D123622#3446171 <https://reviews.llvm.org/D123622#3446171>, @rampitec wrote:

> In D123622#3446149 <https://reviews.llvm.org/D123622#3446149>, @vangthao wrote:
>
>> In D123622#3446136 <https://reviews.llvm.org/D123622#3446136>, @rampitec wrote:
>>
>>> Is there a real use case? I do not like the scheduler going that way.
>>
>> This fixes the regression in SWDEV-316487. I agree that this is making the scheduler too complex. We really need a way to calculate register pressure before hoisting trivially rematerializable defs in MachineLICM, or to make this its own pass.
>
> Is that still a problem? Wasn't it fixed by the first commit?

It is still an issue. We are not able to collect enough trivially rematerializable defs when we only consider single-def/single-use instructions. Multiple identical defs are hoisted and then all but one are eliminated as redundant, which increases the use count of the remaining def. In another case, MachineLICM hoisted parts of a reg sequence and we are unable to sink them back down because they are part of a subreg. This increases overall register pressure throughout the loop and decreases occupancy.
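
As a rough illustration of the first case, here is a toy Python model of how merging redundant hoisted defs raises the use count of the surviving def, so a single-use filter no longer picks it up. None of this is LLVM code: the tuple layout, hoist_and_dedupe, and remat_candidates are hypothetical names invented for the sketch, and the vregs and immediates are made-up values.

```python
from collections import Counter

# Toy model, NOT LLVM's API. A def is (vreg, opcode, immediate);
# `uses` lists the vreg read by each use inside the loop.

def hoist_and_dedupe(defs, uses):
    """Merge identical hoisted defs and rewrite uses to the survivor."""
    canonical = {}  # (opcode, imm) -> surviving vreg
    rename = {}     # every vreg -> the vreg that replaces it
    for vreg, opcode, imm in defs:
        rename[vreg] = canonical.setdefault((opcode, imm), vreg)
    kept = [d for d in defs if rename[d[0]] == d[0]]
    return kept, [rename[u] for u in uses]

def remat_candidates(defs, uses, allow_multiple_uses):
    """Pick defs that a (hypothetical) sinking filter would accept."""
    counts = Counter(uses)
    return [
        vreg
        for vreg, _, _ in defs
        if counts[vreg] == 1 or (allow_multiple_uses and counts[vreg] > 1)
    ]

# Three defs, each with one use; two of them are identical.
defs = [("v0", "S_MOV_B32", 42), ("v1", "S_MOV_B32", 42), ("v2", "S_MOV_B32", 7)]
uses = ["v0", "v1", "v2"]

kept, new_uses = hoist_and_dedupe(defs, uses)
single_use_only = remat_candidates(kept, new_uses, allow_multiple_uses=False)
with_multi_use = remat_candidates(kept, new_uses, allow_multiple_uses=True)
```

Before merging, all three defs have a single use. After merging, v0 has two uses, so the single-use filter only finds v2, while allowing multiple uses finds both v0 and v2.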

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D123622/new/

https://reviews.llvm.org/D123622