[llvm] [AMDGPU][True16][CodeGen] Implement sgpr folding in true16 (PR #128929)
Christudasan Devadasan via llvm-commits
llvm-commits at lists.llvm.org
Mon Mar 17 11:10:01 PDT 2025
================
@@ -1073,9 +1095,43 @@ void SIFoldOperandsImpl::foldOperand(
UseMI->getOperand(0).getReg().isVirtual() &&
!UseMI->getOperand(1).getSubReg()) {
LLVM_DEBUG(dbgs() << "Folding " << OpToFold << "\n into " << *UseMI);
+ unsigned Size = TII->getOpSize(*UseMI, 1);
Register UseReg = OpToFold.getReg();
UseMI->getOperand(1).setReg(UseReg);
- UseMI->getOperand(1).setSubReg(OpToFold.getSubReg());
+ unsigned SubRegIdx = OpToFold.getSubReg();
+ // Hack to allow 32-bit SGPRs to be folded into True16 instructions
+ // Remove this if 16-bit SGPRs (i.e. SGPR_LO16) are added to the
+ // VS_16RegClass
+ //
+ // Excerpt from AMDGPUGenRegisterInfo.inc
+ // NoSubRegister, //0
+ // hi16, // 1
+ // lo16, // 2
+ // sub0, // 3
+ // ...
+ // sub1, // 11
+ // sub1_hi16, // 12
+ // sub1_lo16, // 13
+ static_assert(AMDGPU::sub1_hi16 == 12, "Subregister layout has changed");
----------------
cdevadas wrote:
Is there no way to avoid the hardcoded value 12 here? These fields are autogenerated and they are bound to change.
https://github.com/llvm/llvm-project/pull/128929
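
One possible way to drop the literal, sketched here as an illustration and not as what the patch does: assert the relative position of the indices instead of their absolute values. The `demo` namespace, its enum, and `sub1HalvesFollowSub1` are hypothetical stand-ins for the autogenerated `AMDGPU::` enum. Only the values quoted in the diff comment are reproduced, and the elided entries are left out. Whether the folding code depends on exactly this adjacency is an assumption.

```cpp
// Hypothetical stand-in for the autogenerated subregister index enum.
// Values are copied from the excerpt quoted in the diff comment; the
// entries elided there ("...") are intentionally not reconstructed.
namespace demo {
enum SubRegIndex : unsigned {
  NoSubRegister = 0,
  hi16 = 1,
  lo16 = 2,
  sub0 = 3,
  sub1 = 11,
  sub1_hi16 = 12,
  sub1_lo16 = 13,
};

// Checks that the 16-bit halves of sub1 immediately follow sub1 itself.
// This survives a renumbering that shifts all three indices together,
// which a comparison against the literal 12 would not.
constexpr bool sub1HalvesFollowSub1() {
  return sub1_hi16 == sub1 + 1 && sub1_lo16 == sub1 + 2;
}

static_assert(sub1HalvesFollowSub1(), "Subregister layout has changed");
} // namespace demo
```

A relative check still fails the build if the generator reorders the indices, but it no longer breaks when unrelated indices are inserted ahead of `sub1`.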