[PATCH] D109301: [AMDGPU] Enable copy between VGPR and AGPR classes during regalloc
Christudasan Devadasan via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Nov 15 03:30:58 PST 2021
cdevadas added a comment.
> You cannot optimize it in pre-emit peephole as it will create new hazards which will not be handled.
I missed your comment earlier, sorry.
Yes, trying to optimize them in the late phases would be risky. It should be done no later than the post-RA scheduler.
But I am not sure we can correctly optimize the subreg tuple copies when strict alignment constraints exist.
I guess that after `virtregrewriter` the sub-registers are no longer tied together. Correct me if I'm wrong.
> That is also not what we would want on gfx90a: '$vgpr5_vgpr6_.. ='. I am not sure if spilling code would handle it correctly but this is a misaligned tuple.
This lit test is compiled only for `gfx908`, which does not have the gfx90a alignment requirement. That is why we see the misaligned tuple.
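To make the alignment point concrete, here is a minimal sketch of the check being discussed. It is not LLVM code: `isAlignedVGPRTuple` and its parameters are hypothetical names, and it assumes the gfx90a rule is that a multi-register VGPR tuple must start at an even register index, while gfx908 has no such rule.

```cpp
// Hypothetical helper for illustration only; not an LLVM API.
// FirstRegIndex: index of the first VGPR in the tuple (5 for $vgpr5_vgpr6).
// NumRegs: number of 32-bit registers in the tuple.
// RequiresAlignedTuples: assumed true for gfx90a, false for gfx908.
constexpr bool isAlignedVGPRTuple(unsigned FirstRegIndex, unsigned NumRegs,
                                  bool RequiresAlignedTuples) {
  // Single registers, and targets without the constraint, are always fine.
  // Otherwise the tuple must begin at an even register index.
  return !RequiresAlignedTuples || NumRegs < 2 || FirstRegIndex % 2 == 0;
}

// A tuple starting at vgpr5 is rejected only when alignment is required.
static_assert(!isAlignedVGPRTuple(5, 2, /*RequiresAlignedTuples=*/true),
              "odd-based tuple is misaligned under the assumed gfx90a rule");
static_assert(isAlignedVGPRTuple(5, 2, /*RequiresAlignedTuples=*/false),
              "same tuple is acceptable when no alignment is required");
```

Under these assumptions, a tuple starting at `$vgpr5` is acceptable in a `gfx908`-only test but would be flagged on gfx90a.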
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D109301/new/
https://reviews.llvm.org/D109301