[PATCH] D109301: [AMDGPU] Enable copy between VGPR and AGPR classes during regalloc

Stanislav Mekhanoshin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Nov 15 10:56:24 PST 2021


rampitec added a comment.

In D109301#3130948 <https://reviews.llvm.org/D109301#3130948>, @cdevadas wrote:

>> You cannot optimize it in pre-emit peephole as it will create new hazards which will not be handled.
>
> I missed your comment earlier, sorry.
> Yes, trying to optimize them at late phases would be risky. It should be done no later than Post-RA scheduler.
> But I am not sure we can correctly optimize the subreg tuple copies when strict alignment constraints exist. 
> I guess, after `virtregrewriter` the sub-registers are no longer tied together. Correct me if I'm wrong.

In fact, restoring into an av superclass also seems problematic. I believe we agreed that all the code here only works correctly if we have no actual av registers past selection.

>> That is also not what we would want on gfx90a: '$vgpr5_vgpr6_.. ='. I am not sure if spilling code would handle it correctly but this is a misaligned tuple.
>
> This lit test is compiled only for `gfx908` and that would be the reason we see the misaligned tuple.

OK, on gfx908 this is legal. Something like this must never happen on gfx90a, though. I hope it does not.
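
To make the alignment point concrete, here is a minimal sketch of the check under discussion. It is not code from this patch or from the AMDGPU backend: `RegTuple` and `isLegalTuple` are hypothetical names, and the rule it encodes (tuples of two or more registers must start at an even register index on gfx90a, while gfx908 has no such restriction) is stated here as an assumption.

```cpp
#include <cassert>

// Hypothetical model, not an LLVM API: a VGPR/AGPR tuple described by the
// index of its first 32-bit register and the number of registers it spans.
struct RegTuple {
  unsigned FirstReg;
  unsigned NumRegs;
};

// Assumption: when the subtarget requires aligned tuples (gfx90a in this
// thread), a tuple of two or more registers must start at an even index.
// Without that requirement (gfx908), any start index is accepted.
bool isLegalTuple(RegTuple T, bool RequiresAlignedTuples) {
  if (!RequiresAlignedTuples || T.NumRegs < 2)
    return true;
  return T.FirstReg % 2 == 0;
}
```

With this model, a tuple starting at `$vgpr5` passes when `RequiresAlignedTuples` is false (the gfx908 case in the lit test) and is rejected when it is true (the gfx90a case).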


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109301/new/

https://reviews.llvm.org/D109301


