[PATCH] D105709: [AMDGPU][GlobalISel] Insert an and with exec before s_cbranch_vccnz if necessary
Mirko Brkusanin via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Jul 9 09:22:30 PDT 2021
mbrkusanin created this revision.
mbrkusanin added reviewers: foad, arsenm.
mbrkusanin added a project: LLVM.
Herald added subscribers: kerbowa, hiraditya, t-tye, tpr, dstuttard, rovka, yaxunl, nhaehnle, jvesely, kzhuravl.
mbrkusanin requested review of this revision.
Herald added a subscriber: wdng.
While v_cmp will AND inactive lanes with 0, that is not the case for logical
operations.
This fixes a Vulkan CTS test that would hang otherwise.
Repository:
rG LLVM Github Monorepo
https://reviews.llvm.org/D105709
Files:
llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-brcond.mir
Index: llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-brcond.mir
===================================================================
--- llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-brcond.mir
+++ llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-brcond.mir
@@ -174,3 +174,40 @@
bb.1:
...
+
+---
+
+name: brcond_vcc_not_cmp
+legalized: true
+regBankSelected: true
+
+body: |
+ ; GCN-LABEL: name: brcond_vcc_not_cmp
+ ; GCN: bb.0:
+ ; GCN: successors: %bb.1(0x80000000)
+ ; GCN: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
+ ; GCN: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
+ ; GCN: [[COPY2:%[0-9]+]]:vgpr_32 = COPY $vgpr2
+ ; GCN: [[COPY3:%[0-9]+]]:vgpr_32 = COPY $vgpr3
+ ; GCN: [[V_CMP_EQ_U32_e64_:%[0-9]+]]:sreg_64 = V_CMP_EQ_U32_e64 [[COPY]], [[COPY1]], implicit $exec
+ ; GCN: [[V_CMP_EQ_U32_e64_1:%[0-9]+]]:sreg_64 = V_CMP_EQ_U32_e64 [[COPY2]], [[COPY3]], implicit $exec
+ ; GCN: [[S_AND_B64_:%[0-9]+]]:sreg_64 = S_AND_B64 [[V_CMP_EQ_U32_e64_]], [[V_CMP_EQ_U32_e64_1]], implicit-def dead $scc
+ ; GCN: [[S_AND_B64_1:%[0-9]+]]:sreg_64 = S_AND_B64 [[S_AND_B64_]], $exec, implicit-def $scc
+ ; GCN: $vcc = COPY [[S_AND_B64_1]]
+ ; GCN: S_CBRANCH_VCCNZ %bb.1, implicit $vcc
+ ; GCN: bb.1:
+ bb.0:
+ liveins: $vgpr0, $vgpr1, $vgpr2, $vgpr3
+
+ %0:vgpr(s32) = COPY $vgpr0
+ %1:vgpr(s32) = COPY $vgpr1
+ %2:vgpr(s32) = COPY $vgpr2
+ %3:vgpr(s32) = COPY $vgpr3
+ %4:vcc(s1) = G_ICMP intpred(eq), %0, %1
+ %5:vcc(s1) = G_ICMP intpred(eq), %2, %3
+ %6:vcc(s1) = G_AND %4, %5
+ G_BRCOND %6(s1), %bb.1
+
+ bb.1:
+
+...
Index: llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
===================================================================
--- llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
+++ llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
@@ -2488,11 +2488,25 @@
BrOpcode = AMDGPU::S_CBRANCH_SCC1;
ConstrainRC = &AMDGPU::SReg_32RegClass;
} else {
- // FIXME: Do we have to insert an and with exec here, like in SelectionDAG?
- // We sort of know that a VCC producer based on the register bank, that ands
- // inactive lanes with 0. What if there was a logical operation with vcc
- // producers in different blocks/with different exec masks?
// FIXME: Should scc->vcc copies and with exec?
+
+ // If there was an instruction other than V_CMP then we need to insert an
+ // and with exec.
+ const unsigned CondDefOpc = MRI->getUniqueVRegDef(CondReg)->getOpcode();
+ if (CondDefOpc != AMDGPU::G_ICMP && CondDefOpc != AMDGPU::G_FCMP) {
+ const bool Is64 = STI.isWave64();
+ const TargetRegisterClass *RC =
+ Is64 ? &AMDGPU::SReg_64RegClass : &AMDGPU::SReg_32RegClass;
+ const unsigned Opcode = Is64 ? AMDGPU::S_AND_B64 : AMDGPU::S_AND_B32;
+ const Register Exec = Is64 ? AMDGPU::EXEC : AMDGPU::EXEC_LO;
+
+ Register TmpReg = MRI->createVirtualRegister(RC);
+ BuildMI(*BB, &I, DL, TII.get(Opcode), TmpReg)
+ .addReg(CondReg)
+ .addReg(Exec);
+ CondReg = TmpReg;
+ }
+
CondPhysReg = TRI.getVCC();
BrOpcode = AMDGPU::S_CBRANCH_VCCNZ;
ConstrainRC = TRI.getBoolRC();
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D105709.357541.patch
Type: text/x-patch
Size: 3179 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20210709/fca6acf5/attachment.bin>
More information about the llvm-commits
mailing list