[PATCH] D105800: [AMDGPU] Tidy SReg/SGPR definitions using template class

Mon Jul 12 07:37:13 PDT 2021

foad added a comment.

Looks like a nice refactoring.

================
Comment at: llvm/lib/Target/AMDGPU/SIInstructions.td:1242-1243
 // 96-bit bitcast
-def : BitConvert <v3i32, v3f32, SGPR_96>;
-def : BitConvert <v3f32, v3i32, SGPR_96>;
+def : BitConvert <v3i32, v3f32, SReg_96>;
+def : BitConvert <v3f32, v3i32, SReg_96>;

----------------
Was SGPR_96 some kind of anomaly among the existing classes?

================
Comment at: llvm/lib/Target/AMDGPU/SIRegisterInfo.td:700
+                     SIRegisterTuples ttmpList = regList,
+                     int copyCost = !mul(!sra(numRegs, 1), 2),
+                     bit hasTTMP = !ne(regList, ttmpList),
----------------
Don't you want to divide by two rounding up here, i.e. `!sra(!add(numRegs, 1), 1)`?
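
To illustrate the rounding question, here is a minimal C++ sketch of the two shift expressions. The helper names are hypothetical and not part of the patch. It assumes `numRegs` is a small non-negative register count and that the suggested expression would replace only the inner `!sra(numRegs, 1)`, leaving the outer `!mul(..., 2)` in place. The two forms agree for even `numRegs` and differ for odd sizes such as the 3 registers of a 96-bit tuple.

```cpp
// Hypothetical helpers mirroring the TableGen expressions in the hunk above.
// numRegs is assumed to be a small non-negative count of 32-bit registers.

// !sra(numRegs, 1): halve, rounding down.
constexpr int halfRoundedDown(int numRegs) { return numRegs >> 1; }

// !sra(!add(numRegs, 1), 1): halve, rounding up.
constexpr int halfRoundedUp(int numRegs) { return (numRegs + 1) >> 1; }

// copyCost as written in the patch: !mul(!sra(numRegs, 1), 2).
constexpr int copyCostAsWritten(int numRegs) {
  return halfRoundedDown(numRegs) * 2;
}

// Assumption: the suggestion replaces only the inner shift.
constexpr int copyCostRoundedUp(int numRegs) {
  return halfRoundedUp(numRegs) * 2;
}

// Even sizes give the same cost either way.
static_assert(copyCostAsWritten(4) == copyCostRoundedUp(4), "even sizes agree");
// Odd sizes differ: 3 registers give 2 when rounding down, 4 when rounding up.
static_assert(copyCostAsWritten(3) == 2, "rounds down for 3 registers");
static_assert(copyCostRoundedUp(3) == 4, "rounds up for 3 registers");
```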

================
Comment at: llvm/lib/Target/AMDGPU/SIRegisterInfo.td:727
+
+    // FIXME: ideally would always be isAllocatable = 0,
+    // but that causes all TableGen-generated subclasses to be marked
----------------
Should we be defining TTMP classes for all sizes instead of working around it like this?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105800/new/

https://reviews.llvm.org/D105800