[PATCH] D137767: [AMDGPU] Make aperture registers 64 bit

Matt Arsenault via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Nov 16 08:51:41 PST 2022


arsenm added a comment.

In D137767#3929898 <https://reviews.llvm.org/D137767#3929898>, @Pierre-vh wrote:

> In D137767#3927784 <https://reviews.llvm.org/D137767#3927784>, @arsenm wrote:
>
>> In D137767#3927154 <https://reviews.llvm.org/D137767#3927154>, @Pierre-vh wrote:
>>
>>> Note: I can't use COPY in D137542 <https://reviews.llvm.org/D137542> because otherwise it'll "simplify" it and we end up with instructions that use the _HI register variants, which shouldn't be used and exist only because TableGen needs them (I get hundreds of crashes without them). It seems to assume there'll be a sub1 for every register in the class.
>>
>> That means we're either missing a class restriction or need to reserve the registers.
>
> Ah, the class restriction was it! I fixed it. I got confused and was using SReg_64 (which includes the aperture registers) for the COPY dst register; if I use SGPR (which doesn't include them), copy coalescing won't mess it up.

But naturally it should be SReg_64. It should be valid to copy to VCC, and to be used directly as an SSrc_b64/VSrc_b64 operand. You may want an SReg_64 variant that excludes these registers.



================
Comment at: llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp:567
+  // Reserve the memory aperture registers 32 & 64 bit variants.
+  reserveRegisterTuples(Reserved, AMDGPU::SRC_SHARED_BASE_LO);
   reserveRegisterTuples(Reserved, AMDGPU::SRC_SHARED_BASE);
----------------
The whole point of reserveRegisterTuples is that you don't need to handle each individual subregister.
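
To illustrate the point, here is a minimal standalone sketch of the idea, not the actual LLVM code. It assumes the helper marks every register that overlaps the one passed in, which is how I understand reserveRegisterTuples to behave. The names ToyReg, toyAliases and toyReserveRegisterTuples, the register numbering, and the use of std::bitset in place of LLVM's BitVector are all hypothetical and exist only for this example.

```cpp
#include <bitset>
#include <cassert>
#include <vector>

// Hypothetical register numbering for this sketch only; these are not
// LLVM's enum values.
enum ToyReg : unsigned {
  SHARED_BASE_LO, // low 32-bit half of the aperture register
  SHARED_BASE_HI, // high 32-bit half of the aperture register
  SHARED_BASE,    // 64-bit register covering both halves
  SGPR0,          // unrelated register
  NUM_REGS
};

// Returns every register that overlaps Reg, including Reg itself.
// This hand-written table stands in for target-generated alias information.
std::vector<unsigned> toyAliases(unsigned Reg) {
  switch (Reg) {
  case SHARED_BASE_LO:
    return {SHARED_BASE_LO, SHARED_BASE};
  case SHARED_BASE_HI:
    return {SHARED_BASE_HI, SHARED_BASE};
  case SHARED_BASE:
    return {SHARED_BASE, SHARED_BASE_LO, SHARED_BASE_HI};
  default:
    return {Reg};
  }
}

// Marks Reg and everything overlapping it as reserved.
void toyReserveRegisterTuples(std::bitset<NUM_REGS> &Reserved, unsigned Reg) {
  assert(Reg < NUM_REGS && "unknown register");
  for (unsigned Alias : toyAliases(Reg))
    Reserved.set(Alias);
}
```

Under these assumptions, a single call on SHARED_BASE also marks SHARED_BASE_LO and SHARED_BASE_HI, so the extra call on the _LO register in the diff would be redundant.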


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D137767/new/

https://reviews.llvm.org/D137767


