[PATCH] D109301: [AMDGPU] Enable copy between VGPR and AGPR classes during regalloc

Christudasan Devadasan via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Nov 15 03:30:58 PST 2021


cdevadas added a comment.

> You cannot optimize it in pre-emit peephole as it will create new hazards which will not be handled.

I missed your comment earlier, sorry.
Yes, trying to optimize them in late phases would be risky. It should be done no later than the post-RA scheduler.
But I am not sure we can correctly optimize the subreg tuple copies when strict alignment constraints exist.
I guess that after `virtregrewriter` the sub-registers are no longer tied together. Correct me if I'm wrong.

> That is also not what we would want on gfx90a: '$vgpr5_vgpr6_.. ='. I am not sure if spilling code would handle it correctly but this is a misaligned tuple.

This lit test is compiled only for `gfx908`, which would be why we see the misaligned tuple.
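
To illustrate the alignment concern, here is a minimal standalone sketch. It assumes that on `gfx90a` a multi-register VGPR tuple must start at an even register index, while `gfx908` has no such restriction. The `Target` enum and `isTupleAligned` are hypothetical names made up for this example and are not part of the LLVM AMDGPU backend.

```cpp
#include <cassert>

// Hypothetical target tags for this sketch; not LLVM's subtarget API.
enum class Target { GFX908, GFX90A };

// Returns true if a VGPR tuple starting at vgpr<firstReg> and spanning
// numRegs registers satisfies the alignment rule assumed in this sketch:
// on gfx90a, a tuple of two or more registers must start at an even index.
bool isTupleAligned(Target target, unsigned firstReg, unsigned numRegs) {
  if (numRegs < 2)
    return true; // A single register has no tuple alignment constraint.
  if (target == Target::GFX90A)
    return firstReg % 2 == 0; // Assumed even-alignment requirement.
  return true; // gfx908: no alignment requirement assumed.
}
```

Under these assumptions a tuple starting at `$vgpr5` is accepted for `gfx908` and rejected for `gfx90a`, which matches the test only being run for `gfx908`.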


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109301/new/

https://reviews.llvm.org/D109301


