[PATCH] D105800: [AMDGPU] Tidy SReg/SGPR definitions using template class
Carl Ritson via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Jul 13 02:26:03 PDT 2021
critson added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/SIInstructions.td:1242-1243
// 96-bit bitcast
-def : BitConvert <v3i32, v3f32, SGPR_96>;
-def : BitConvert <v3f32, v3i32, SGPR_96>;
+def : BitConvert <v3i32, v3f32, SReg_96>;
+def : BitConvert <v3f32, v3i32, SReg_96>;
----------------
arsenm wrote:
> foad wrote:
> > Was SGPR_96 some kind of anomaly among the existing classes?
> I think the *GPR_* class names are better where applicable
OK, I've put these back. Pretty much everything else in this file uses SReg_*
================
Comment at: llvm/lib/Target/AMDGPU/SIRegisterInfo.td:700
+ SIRegisterTuples ttmpList = regList,
+ int copyCost = !mul(!sra(numRegs, 1), 2),
+ bit hasTTMP = !ne(regList, ttmpList),
----------------
foad wrote:
> Don't you want to divide by two rounding up, so `!sra(!add(numRegs, 1), 1)`?
Thanks.
================
Comment at: llvm/lib/Target/AMDGPU/SIRegisterInfo.td:727
+
+ // FIXME: ideally would always be isAllocatable = 0,
+ // but that causes all TableGen-generated subclasses to be marked
----------------
foad wrote:
> Should we be defining TTMP classes for all sizes instead of working around it like this?
At the moment if I create the TTMP_96Regs set then SGPR_128 stops working correctly. So I need to investigate further if we want to do that. Also SGPR_1024 will never have an associated TTMP set because there are only 16 trap registers.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D105800/new/
https://reviews.llvm.org/D105800
More information about the llvm-commits
mailing list