[llvm] AMDGPU/GlobalISel: Fix inst-selection of ballot (PR #109986)

Matt Arsenault via llvm-commits llvm-commits at lists.llvm.org
Thu Oct 10 05:11:04 PDT 2024


================
@@ -1413,50 +1413,97 @@ bool AMDGPUInstructionSelector::selectIntrinsicCmp(MachineInstr &I) const {
   return true;
 }
 
+// Ballot has to zero bits in input lane-mask that are zero in current exec,
+// Done as AND with exec. For inputs that are results of instruction that
+// implicitly use same exec, for example compares in same basic block, use copy.
+bool isBallotCopy(Register Reg, MachineRegisterInfo &MRI,
+                  MachineBasicBlock *MBB) {
+  MachineInstr *MI = MRI.getVRegDef(Reg);
+  // Look through copies, truncs and anyext. TODO: just copies
----------------
arsenm wrote:

That would be at most one copy? 

Also wondering whether such a trivially uniform ballot should already have been folded out before reaching selection.
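
For context, the distinction the quoted comment draws (AND with exec versus a plain copy) can be modelled on a 64-bit lane mask. This is only an illustrative sketch, not the selector code from the patch: `LaneMask`, `compareUnderExec`, `ballotWithAnd` and `ballotWithCopy` are hypothetical names, and wave64 is assumed.

```cpp
#include <cstdint>

// Hypothetical model of a wave64 lane mask: bit i corresponds to lane i.
using LaneMask = std::uint64_t;

// Models an instruction that implicitly uses exec (e.g. a compare):
// lanes that are inactive in exec always produce a 0 bit.
LaneMask compareUnderExec(LaneMask perLaneResult, LaneMask exec) {
  return perLaneResult & exec;
}

// General ballot lowering: zero the bits of lanes that are inactive
// in the current exec by ANDing the input with exec.
LaneMask ballotWithAnd(LaneMask input, LaneMask exec) {
  return input & exec;
}

// Copy lowering: only correct when the input is already known to have
// no bits set outside the current exec.
LaneMask ballotWithCopy(LaneMask input) {
  return input;
}
```

When the input comes from `compareUnderExec` with the same exec, both lowerings give the same result, which is why a copy is enough in that case. For an arbitrary input mask they can differ, so the AND is needed.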

https://github.com/llvm/llvm-project/pull/109986

