[llvm] AMDGPU/GlobalISel: Fix inst-selection of ballot (PR #109986)
Nicolai Hähnle via llvm-commits
llvm-commits at lists.llvm.org
Mon Oct 14 02:32:39 PDT 2024
================
@@ -1413,50 +1413,101 @@ bool AMDGPUInstructionSelector::selectIntrinsicCmp(MachineInstr &I) const {
return true;
}
+// Ballot has to zero bits in input lane-mask that are zero in current exec,
+// Done as AND with exec. For inputs that are results of instruction that
+// implicitly use same exec, for example compares in same basic block or SCC to
+// VCC copy, use copy.
+static bool isLaneMaskFromSameBlock(Register Reg, MachineRegisterInfo &MRI,
+                                    MachineBasicBlock *MBB) {
+  MachineInstr *MI = MRI.getVRegDef(Reg);
+  if (MI->getParent() != MBB)
+    return false;
+
+  // Lane mask generated by SCC to VCC copy.
+  if (MI->getOpcode() == AMDGPU::COPY) {
+    auto DstRB = MRI.getRegBankOrNull(MI->getOperand(0).getReg());
+    auto SrcRB = MRI.getRegBankOrNull(MI->getOperand(1).getReg());
+    if (DstRB && SrcRB && DstRB->getID() == AMDGPU::VCCRegBankID &&
+        SrcRB->getID() == AMDGPU::SGPRRegBankID)
+      return true;
+  }
+
+  // Lane mask generated using compare with same exec.
+  if (isa<GAnyCmp>(MI))
----------------
nhaehnle wrote:
This is now missing a check that MI is in MBB.
https://github.com/llvm/llvm-project/pull/109986
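
For readers following the thread, here is a minimal standalone sketch of why the same-block check matters. It is not code from the patch: `LaneMask`, `compareUnderExec`, `ballot`, and `ballotIsCopy` are hypothetical names, and the model assumes a lane mask is a plain 64-bit integer and that a compare leaves the bits of inactive lanes at zero.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model: one bit per lane of a 64-wide wave.
using LaneMask = std::uint64_t;

// Assumption: a compare executed under `exec` produces zero bits for every
// lane that is inactive in `exec`.
constexpr LaneMask compareUnderExec(LaneMask laneResults, LaneMask exec) {
  return laneResults & exec;
}

// Ballot must zero the input bits that are zero in the current exec.
constexpr LaneMask ballot(LaneMask input, LaneMask exec) {
  return input & exec;
}

// True when the AND with exec is a no-op, so a plain copy would be enough.
constexpr bool ballotIsCopy(LaneMask input, LaneMask exec) {
  return ballot(input, exec) == input;
}

// Small demonstration of both cases.
inline void demo() {
  const LaneMask Exec = 0x0F;

  // Mask defined under the same exec: the copy is safe.
  assert(ballotIsCopy(compareUnderExec(0xFF, Exec), Exec));

  // Mask defined under a wider exec (for example in another block):
  // the copy would leak bits of lanes that are now inactive.
  assert(!ballotIsCopy(compareUnderExec(0xFF, 0xFF), Exec));
}
```

In this model the copy is only valid when the mask was produced under the exec that the ballot sees, which is the property the `MI->getParent() != MBB` test approximates.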