[llvm] [AMDGPU][True16][CodeGen]Support V2S copy with True16 flow (PR #118037)
Matt Arsenault via llvm-commits
llvm-commits at lists.llvm.org
Wed Jan 15 02:59:46 PST 2025
================
@@ -1075,10 +1075,25 @@ void SIFixSGPRCopies::lowerVGPR2SGPRCopies(MachineFunction &MF) {
           TRI->getRegClassForOperandReg(*MRI, MI->getOperand(1));
       size_t SrcSize = TRI->getRegSizeInBits(*SrcRC);
       if (SrcSize == 16) {
-        // HACK to handle possible 16bit VGPR source
-        auto MIB = BuildMI(*MBB, MI, MI->getDebugLoc(),
-                           TII->get(AMDGPU::V_READFIRSTLANE_B32), DstReg);
-        MIB.addReg(SrcReg, 0, AMDGPU::NoSubRegister);
+        assert(MF.getSubtarget<GCNSubtarget>().useRealTrue16Insts() &&
+               "We do not expect to see 16-bit copies from VGPR to SGPR unless "
+               "we have 16-bit VGPRs");
+        assert(MRI->getRegClass(DstReg) == &AMDGPU::SGPR_LO16RegClass ||
+               MRI->getRegClass(DstReg) == &AMDGPU::SReg_32RegClass);
+        // There is no V_READFIRSTLANE_B16, so widen the destination scalar
+        // value to 32 bits
+        MRI->setRegClass(DstReg, &AMDGPU::SGPR_32RegClass);
----------------
arsenm wrote:
Could also just query the register class from V_READFIRSTLANE_B32's operand info
https://github.com/llvm/llvm-project/pull/118037