[llvm] [AMDGPU] Allocate i1 argument to SGPRs (PR #72461)
Jun Wang via llvm-commits
llvm-commits at lists.llvm.org
Mon May 20 14:11:02 PDT 2024
================
@@ -121,6 +127,15 @@ struct AMDGPUIncomingArgHandler : public CallLowering::IncomingValueHandler {
const CCValAssign &VA) override {
markPhysRegUsed(PhysReg);
+ if (VA.getLocVT() == MVT::i1) {
+ MIRBuilder.buildCopy(ValVReg, PhysReg);
+ MRI.setRegClass(ValVReg, MIRBuilder.getMF()
+ .getSubtarget<GCNSubtarget>()
+ .getRegisterInfo()
+ ->getBoolRC());
----------------
jwanggit86 wrote:
Case 2: i1 argument used in a branch.
After IRTranslator:
```
%0:_(s1) = COPY $sgpr4_sgpr5
%1:_(s1) = G_CONSTANT i1 true
%5:_(s32) = G_CONSTANT i32 0
%6:_(p1) = G_GLOBAL_VALUE @static.gv2
%9:_(p1) = G_GLOBAL_VALUE @static.gv0
%2:_(s1) = G_XOR %0:_, %1:_
%3:_(s1), %4:_(s64) = G_INTRINSIC_CONVERGENT_W_SIDE_EFFECTS intrinsic(@llvm.amdgcn.if), %2:_(s1)
G_BRCOND %3:_(s1), %bb.4
```
After RegBankSelect:
```
%0:sgpr(s1) = COPY $sgpr4_sgpr5
%20:sgpr(s32) = G_CONSTANT i32 1
%1:sgpr(s1) = G_TRUNC %20:sgpr(s32)
%21:vcc(s1) = COPY %0:sgpr(s1)
%22:vcc(s1) = COPY %1:sgpr(s1)
%2:sreg_64_xexec(s1) = G_XOR %21:vcc, %22:vcc
%4:sreg_64_xexec(s64) = SI_IF %2:sreg_64_xexec(s1), %bb.2, implicit-def $exec, implicit-def $scc, implicit $exec
```
After InstructionSelect:
```
%0:sreg_32 = COPY $sgpr4_sgpr5 // will cause "illegal copy"
%29:sreg_32 = S_AND_B32 1, %0:sreg_32, implicit-def dead $scc
%21:sreg_64_xexec = V_CMP_NE_U32_e64 0, %29:sreg_32, implicit $exec
%22:sreg_64_xexec = S_MOV_B64 -1
%2:sreg_64_xexec = S_XOR_B64 %21:sreg_64_xexec, %22:sreg_64_xexec, implicit-def dead $scc
%4:sreg_64_xexec = SI_IF %2:sreg_64_xexec, %bb.2, implicit-def $exec, implicit-def $scc, implicit $exec
S_BRANCH %bb.4
```
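The "illegal copy" note on the first selected instruction presumably comes from a width mismatch: `$sgpr4_sgpr5` is a pair of 32-bit SGPRs (64 bits), while the destination was constrained to `sreg_32` (32 bits). The sketch below is a toy model of that check only. `RegModel` and `widthsMatch` are hypothetical names introduced for illustration, not LLVM APIs, and the real diagnostic in the backend may test more than width.

```cpp
// Toy model, NOT LLVM API: a register or register class reduced to
// a name and a bit width.
struct RegModel {
  const char *Name;
  unsigned Bits;
};

// Widths as implied by the register names in the dumps above.
constexpr RegModel SGPR4_SGPR5{"$sgpr4_sgpr5", 64};  // two 32-bit SGPRs
constexpr RegModel SReg32{"sreg_32", 32};
constexpr RegModel SReg64XExec{"sreg_64_xexec", 64};

// Hypothetical check (assumption): a plain COPY is only sensible when
// source and destination have the same width.
constexpr bool widthsMatch(const RegModel &Dst, const RegModel &Src) {
  return Dst.Bits == Src.Bits;
}

// %0:sreg_32 = COPY $sgpr4_sgpr5 mixes a 32-bit class with a 64-bit register.
static_assert(!widthsMatch(SReg32, SGPR4_SGPR5), "32-bit dst, 64-bit src");

// A 64-bit class such as sreg_64_xexec would match the incoming pair.
static_assert(widthsMatch(SReg64XExec, SGPR4_SGPR5), "both 64 bits wide");
```

Under this model the copy would only be consistent if `%0` kept a 64-bit class, which is the class the other boolean values in the same dump (`%21`, `%22`, `%2`) already use.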
https://github.com/llvm/llvm-project/pull/72461