[llvm] [AMDGPU] Save/Restore SCC bit across waterfall loop. (PR #68363)
Jay Foad via llvm-commits
llvm-commits at lists.llvm.org
Mon Oct 16 09:34:15 PDT 2023
================
@@ -1122,9 +1126,16 @@ void SIFixSGPRCopies::fixSCCCopies(MachineFunction &MF) {
         continue;
       }
       if (DstReg == AMDGPU::SCC) {
-        unsigned Opcode = IsWave32 ? AMDGPU::S_AND_B32 : AMDGPU::S_AND_B64;
-        Register Exec = IsWave32 ? AMDGPU::EXEC_LO : AMDGPU::EXEC;
-        Register Tmp = MRI->createVirtualRegister(TRI->getBoolRC());
+        const TargetRegisterClass *SrcRC =
+            TRI->getRegClassForOperandReg(*MRI, MI.getOperand(1));
+        unsigned SrcRegSize = TRI->getRegSizeInBits(*SrcRC);
+        assert((SrcRegSize == 64 || SrcRegSize == 32) &&
+               "Expected SCC src to be 64 or 32 bits");
+        bool IsSrc32Bit = SrcRegSize == 32;
+        unsigned Opcode = IsSrc32Bit ? AMDGPU::S_AND_B32 : AMDGPU::S_AND_B64;
+        Register Exec = IsSrc32Bit ? AMDGPU::EXEC_LO : AMDGPU::EXEC;
----------------
jayfoad wrote:
This part does not make any sense to me. In wave64 mode, when copying a 32-bit source to scc, why would it be correct to AND with exec_lo?
Stepping back a bit, this code seems to be handling two different cases for the source operand:
1. a divergent 1-bit value represented as a wave-wide register with 1 bit per lane
2. a uniform value represented as a scalar register that could be either 32 or 64 bits wide
How can we tell these two cases apart?
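To make the concern concrete, here is a toy model of the two cases in plain C++. It is not LLVM code: the function names are hypothetical, and it assumes that a uniform scalar source should set scc exactly when the value is nonzero, regardless of which lanes are active.

```cpp
#include <cstdint>

// Toy model only. In wave64, exec is a 64-bit mask with one bit per lane,
// and scc is a single bit. All names below are hypothetical.

// Case 1: divergent 1-bit value stored as a wave-wide lane mask.
// ANDing with exec drops inactive lanes; scc is set if any active lane is set.
bool sccFromLaneMask(uint64_t LaneMask, uint64_t Exec) {
  return (LaneMask & Exec) != 0;
}

// Case 2: uniform value held in a 32-bit scalar register.
// Assumed semantics: scc is set when the value is nonzero; exec is irrelevant.
bool sccFromUniform32(uint32_t Value) {
  return Value != 0;
}

// What S_AND_B32 with exec_lo would compute for a 32-bit source in wave64:
// scc is set when (Value & low 32 bits of exec) is nonzero.
bool sccFromUniform32AndExecLo(uint32_t Value, uint64_t Exec) {
  uint32_t ExecLo = static_cast<uint32_t>(Exec);
  return (Value & ExecLo) != 0;
}
```

Under these assumptions, a uniform value of 1 with only the upper 32 lanes active (exec = 0xFFFFFFFF00000000) is true in case 2, but the exec_lo AND yields false, which is the mismatch the question above is pointing at.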
https://github.com/llvm/llvm-project/pull/68363