[PATCH] D105800: [AMDGPU] Tidy SReg/SGPR definitions using template class

Carl Ritson via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Jul 13 02:26:03 PDT 2021


critson added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/SIInstructions.td:1242-1243
 // 96-bit bitcast
-def : BitConvert <v3i32, v3f32, SGPR_96>;
-def : BitConvert <v3f32, v3i32, SGPR_96>;
+def : BitConvert <v3i32, v3f32, SReg_96>;
+def : BitConvert <v3f32, v3i32, SReg_96>;
 
----------------
arsenm wrote:
> foad wrote:
> > Was SGPR_96 some kind of anomaly among the existing classes?
> I think the *GPR_* class names are better where applicable
OK, I've put these back. Pretty much everything else in this file uses SReg_*


================
Comment at: llvm/lib/Target/AMDGPU/SIRegisterInfo.td:700
+                     SIRegisterTuples ttmpList = regList,
+                     int copyCost = !mul(!sra(numRegs, 1), 2),
+                     bit hasTTMP = !ne(regList, ttmpList),
----------------
foad wrote:
> Don't you want to divide by two rounding up, so `!sra(!add(numRegs, 1), 1)`?
Thanks.


================
Comment at: llvm/lib/Target/AMDGPU/SIRegisterInfo.td:727
+
+    // FIXME: ideally would always be isAllocatable = 0,
+    // but that causes all TableGen-generated subclasses to be marked
----------------
foad wrote:
> Should we be defining TTMP classes for all sizes instead of working around it like this?
At the moment if I create the TTMP_96Regs set then SGPR_128 stops working correctly.  So I need to investigate further if we want to do that.  Also SGPR_1024 will never have an associated TTMP set because there are only 16 trap registers.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105800/new/

https://reviews.llvm.org/D105800



More information about the llvm-commits mailing list