[llvm] [AMDGPU] Save/Restore SCC bit across waterfall loop. (PR #68363)

Jay Foad via llvm-commits llvm-commits at lists.llvm.org
Mon Oct 16 09:34:15 PDT 2023


================
@@ -1122,9 +1126,16 @@ void SIFixSGPRCopies::fixSCCCopies(MachineFunction &MF) {
         continue;
       }
       if (DstReg == AMDGPU::SCC) {
-        unsigned Opcode = IsWave32 ? AMDGPU::S_AND_B32 : AMDGPU::S_AND_B64;
-        Register Exec = IsWave32 ? AMDGPU::EXEC_LO : AMDGPU::EXEC;
-        Register Tmp = MRI->createVirtualRegister(TRI->getBoolRC());
+        const TargetRegisterClass *SrcRC =
+            TRI->getRegClassForOperandReg(*MRI, MI.getOperand(1));
+        unsigned SrcRegSize = TRI->getRegSizeInBits(*SrcRC);
+        assert((SrcRegSize == 64 || SrcRegSize == 32) &&
+               "Expected SCC src to be 64 or 32 bits");
+        bool IsSrc32Bit = SrcRegSize == 32;
+        unsigned Opcode = IsSrc32Bit ? AMDGPU::S_AND_B32 : AMDGPU::S_AND_B64;
+        Register Exec = IsSrc32Bit ? AMDGPU::EXEC_LO : AMDGPU::EXEC;
----------------
jayfoad wrote:

This part does not make any sense to me. In wave64 mode, when copying a 32-bit source to SCC, why would it be correct to AND with exec_lo?

Stepping back a bit, this code seems to be handling two different cases for the source operand:
1. a divergent 1-bit value represented as a wave-wide register with 1 bit per lane
2. a uniform value represented as a scalar register that could be either 32 or 64 bits wide

How can we tell these two cases apart?
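
To illustrate the concern, here is a minimal standalone sketch. It is not LLVM code: the helper names are hypothetical, and the helpers only model the fact that S_AND_B32/S_AND_B64 set SCC when the result is non-zero. It assumes a wave64 wave in which only lanes 32..63 are active, and it assumes that a uniform source is meant to be "true" whenever it is non-zero.

```cpp
#include <cstdint>

// Hypothetical model of "S_AND_B64 tmp, src, exec": SCC = (result != 0).
constexpr bool sccAfterAnd64(uint64_t Src, uint64_t Exec) {
  return (Src & Exec) != 0;
}

// Hypothetical model of "S_AND_B32 tmp, src, exec_lo": SCC = (result != 0).
constexpr bool sccAfterAnd32(uint32_t Src, uint64_t Exec) {
  return (Src & static_cast<uint32_t>(Exec)) != 0;
}

// Assumed meaning of a uniform scalar source: true iff the value is non-zero.
constexpr bool uniformIsTrue(uint32_t Src) { return Src != 0; }

// Assumed state: wave64 with only lanes 32..63 active.
constexpr uint64_t Exec = 0xFFFFFFFF00000000ull;

// Case 1: divergent value, one bit per lane, set in (active) lane 40.
constexpr uint64_t LaneMask = 1ull << 40;

// Case 2: uniform 32-bit scalar holding the value 1.
constexpr uint32_t Uniform = 1;

// Case 1: ANDing with exec masks off inactive lanes, so SCC means
// "some active lane is true".
constexpr bool Case1 = sccAfterAnd64(LaneMask, Exec);

// Case 2: bit 0 of the scalar is not a lane bit, but the AND with exec_lo
// treats it as lane 0, which is inactive here.
constexpr bool Case2 = sccAfterAnd32(Uniform, Exec);
```

Under these assumptions Case1 is true, as intended, while Case2 is false even though uniformIsTrue(Uniform) is true. In this model the uniform case cannot be handled by the same AND-with-exec sequence as the lane-mask case.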

https://github.com/llvm/llvm-project/pull/68363