[llvm] [AMDGPU] Allocate i1 argument to SGPRs (PR #72461)

Jun Wang via llvm-commits llvm-commits at lists.llvm.org
Wed Jan 17 15:17:08 PST 2024


================
@@ -715,6 +715,14 @@ bool SILowerI1Copies::lowerCopiesToI1() {
       assert(!MI.getOperand(1).getSubReg());
 
       if (!SrcReg.isVirtual() || (!isLaneMaskReg(SrcReg) && !isVreg1(SrcReg))) {
+        if (!SrcReg.isVirtual() &&
+            TII->getRegisterInfo().getRegSizeInBits(SrcReg, *MRI) == 64) {
+          // When calling convention allocates SGPR for i1, for GPUs with
+          // wavefront size 64, i1 return value is put in 64b SGPR.
+          assert(ST->isWave64());
+          continue;
+        }
+
----------------
jwanggit86 wrote:

```
      MRI->setRegClass(DstReg, IsWave32 ? &AMDGPU::SReg_32RegClass
                                        : &AMDGPU::SReg_64RegClass);
      ...      
      if (!SrcReg.isVirtual() || (!isLaneMaskReg(SrcReg) && !isVreg1(SrcReg))) {
        if (!SrcReg.isVirtual() &&
            TII->getRegisterInfo().getRegSizeInBits(SrcReg, *MRI) == 64) {
          // When calling convention allocates SGPR for i1, for GPUs with
          // wavefront size 64, i1 return value is put in 64b SGPR.
          assert(ST->isWave64());
          continue;
        }

        assert(TII->getRegisterInfo().getRegSizeInBits(SrcReg, *MRI) == 32);
        ...
```
Actually, before the `if` on line 717 there is a call to `setRegClass`, which changes the
register class of the dest reg %3 from vreg_1 to sreg_64. So by the time line 718 is reached,
the insn has already been updated to `"%3: sreg_64 = COPY $sgpr0_sgpr1"`. The lines added by
this patch (718-725) are simply there to avoid triggering the assert on line 726, which assumes
that the src phys reg is 32b. For wave64 GPUs, with the calling conv change introduced by this
patch, that assumption is no longer valid.
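
To make the control flow described above easier to follow, here is a toy model of just the branch structure. It is only a sketch and does not use LLVM: `SrcRegInfo`, `CopySrcKind` and `classifyCopySource` are hypothetical names made up for illustration, and the struct fields merely stand in for the results of `SrcReg.isVirtual()`, `isLaneMaskReg`/`isVreg1` and `getRegSizeInBits`.

```cpp
#include <cassert>

// Toy stand-ins, NOT LLVM types. Every name below is hypothetical.
struct SrcRegInfo {
  bool IsVirtual;          // stands in for SrcReg.isVirtual()
  bool IsLaneMaskOrVreg1;  // stands in for isLaneMaskReg(SrcReg) || isVreg1(SrcReg)
  unsigned SizeInBits;     // stands in for getRegSizeInBits(SrcReg, *MRI)
};

enum class CopySrcKind {
  SkippedWave64PhysSrc,  // new early-out added by the patch (the `continue`)
  Expects32BitSrc,       // path guarded by the pre-existing 32-bit assert
  LaneMaskOrVreg1Src     // outer condition was false
};

// Mirrors the branch structure of the quoted snippet, nothing more.
CopySrcKind classifyCopySource(const SrcRegInfo &Src, bool IsWave64) {
  if (!Src.IsVirtual || !Src.IsLaneMaskOrVreg1) {
    if (!Src.IsVirtual && Src.SizeInBits == 64) {
      // A 64-bit physical source (e.g. an SGPR pair) is only expected on wave64.
      assert(IsWave64);
      return CopySrcKind::SkippedWave64PhysSrc;
    }
    // Without the early-out above, a 64-bit physical source would trip this.
    assert(Src.SizeInBits == 32);
    return CopySrcKind::Expects32BitSrc;
  }
  return CopySrcKind::LaneMaskOrVreg1Src;
}
```

In this model, the wave64 case from the discussion corresponds to `classifyCopySource({false, false, 64}, true)`, which takes the early-out instead of reaching the 32-bit assertion.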

https://github.com/llvm/llvm-project/pull/72461

