[PATCH] D137767: [AMDGPU] Make aperture registers 64 bit
Matt Arsenault via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Nov 16 08:51:41 PST 2022
arsenm added a comment.
In D137767#3929898 <https://reviews.llvm.org/D137767#3929898>, @Pierre-vh wrote:
> In D137767#3927784 <https://reviews.llvm.org/D137767#3927784>, @arsenm wrote:
>
>> In D137767#3927154 <https://reviews.llvm.org/D137767#3927154>, @Pierre-vh wrote:
>>
>>> Note: I can't use COPY in D137542 <https://reviews.llvm.org/D137542> because otherwise it'll "simplify" it and we end up with instructions that use the _HI register variants - which shouldn't be there and are just here because TableGen needs them/I get hundreds of crashes without it. It seems to assume there'll be sub1 for every reg in the class.
>>
>> That either means we're missing a class restriction or need a reserve
>
> Ah the class restriction was it! I fixed it. I got confused and was using SREG_64 (includes aperture register) for the COPY dst register, but if I use SGPR (doesn't include it) then copy coalescing won't mess it up
But naturally it should be SReg_64. It should be valid to copy it to VCC and to use it directly as an SSrc_b64/VSrc_b64 operand. You may want an SReg_64 variant that excludes these registers.
================
Comment at: llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp:567
+ // Reserve the memory aperture registers 32 & 64 bit variants.
+ reserveRegisterTuples(Reserved, AMDGPU::SRC_SHARED_BASE_LO);
reserveRegisterTuples(Reserved, AMDGPU::SRC_SHARED_BASE);
----------------
The whole point of reserveRegisterTuples is that you don't need to handle each individual subregister, so the separate SRC_SHARED_BASE_LO call should be redundant.
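To illustrate the point, here is a minimal standalone sketch of the idea, not the actual SIRegisterInfo code. It assumes the helper reserves every register that overlaps the one passed in, which is how I understand the real reserveRegisterTuples to behave. The ToyReg enum, the unit masks, unitsOf and reservedAfter are all hypothetical names made up for this example.

```cpp
#include <bitset>
#include <cstddef>

// Toy register numbering. These values are hypothetical and do not match
// the real AMDGPU register enum.
enum ToyReg : unsigned {
  SRC_SHARED_BASE_LO, // low 32 bits of the aperture register
  SRC_SHARED_BASE_HI, // high 32 bits of the aperture register
  SRC_SHARED_BASE,    // the full 64-bit aperture register
  SGPR0,              // an unrelated register
  NumToyRegs
};

// Each bit is one "register unit": a smallest non-overlapping piece of the
// register file. Two registers alias if they share at least one unit.
using UnitMask = std::bitset<3>;
using ReservedSet = std::bitset<NumToyRegs>;

// Hypothetical unit assignment for the toy registers above.
UnitMask unitsOf(unsigned Reg) {
  switch (Reg) {
  case SRC_SHARED_BASE_LO: return UnitMask(0x1);
  case SRC_SHARED_BASE_HI: return UnitMask(0x2);
  case SRC_SHARED_BASE:    return UnitMask(0x3); // covers both halves
  case SGPR0:              return UnitMask(0x4);
  default:                 return UnitMask();
  }
}

// Toy stand-in for reserveRegisterTuples: reserve Reg together with every
// register that overlaps it (assumed behavior, see the note above).
void reserveRegisterTuples(ReservedSet &Reserved, unsigned Reg) {
  const UnitMask Units = unitsOf(Reg);
  for (unsigned R = 0; R < NumToyRegs; ++R)
    if ((unitsOf(R) & Units).any())
      Reserved.set(R);
}

// Convenience wrapper: the reserved set after a single call on Reg.
ReservedSet reservedAfter(unsigned Reg) {
  ReservedSet Reserved;
  reserveRegisterTuples(Reserved, Reg);
  return Reserved;
}
```

In this model, a single call on SRC_SHARED_BASE marks the 64-bit register and both 32-bit halves as reserved and leaves SGPR0 alone, so an extra call on the _LO register adds nothing.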
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D137767/new/
https://reviews.llvm.org/D137767