[PATCH] D67662: [AMDGPU] SIFoldOperands should not fold register across the EXEC definition

Alexander via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Sep 17 08:32:53 PDT 2019


alex-t created this revision.
alex-t added reviewers: rampitec, vpykhtin.
Herald added subscribers: llvm-commits, hiraditya, t-tye, tpr, dstuttard, yaxunl, nhaehnle, wdng, jvesely, kzhuravl, arsenm.
Herald added a project: LLVM.

A register defined by a copy that implicitly uses EXEC should not be folded if EXEC is redefined between the register's definition and the instruction into which it would be folded. Otherwise, scalar values may be exposed outside the divergent loop without being copied to a VGPR.

For instance:

r1 = copy  r0 imp use exec
exec = def_exec

some_inst use(r1)

Here r0 cannot be folded into some_inst, because EXEC is redefined between the copy and the use.
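
To make the rule concrete, here is a minimal sketch on a toy straight-line instruction list. The ToyInst type and the foldCrossesExecDef function are hypothetical names used only for illustration; they are not part of LLVM. The sketch also assumes a single linear sequence, whereas the check in this patch looks only at the first terminator of the defining block when the definition and the use are in different blocks.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model of one instruction. Hypothetical type, not LLVM's MachineInstr.
struct ToyInst {
  bool ImplicitlyUsesExec; // e.g. "r1 = copy r0 imp use exec"
  bool DefinesExec;        // e.g. "exec = def_exec"
};

// Returns true if folding the source of the copy at CopyIdx into the
// instruction at UseIdx would be unsafe under the rule described above:
// the copy implicitly reads EXEC, and some instruction strictly between
// the copy and the use redefines EXEC.
bool foldCrossesExecDef(const std::vector<ToyInst> &Insts,
                        std::size_t CopyIdx, std::size_t UseIdx) {
  assert(CopyIdx < UseIdx && UseIdx < Insts.size());
  if (!Insts[CopyIdx].ImplicitlyUsesExec)
    return false;
  for (std::size_t I = CopyIdx + 1; I < UseIdx; ++I)
    if (Insts[I].DefinesExec)
      return true;
  return false;
}
```

For the three-instruction sequence above (copy, EXEC definition, use), calling foldCrossesExecDef with CopyIdx 0 and UseIdx 2 reports that the fold must be skipped.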


https://reviews.llvm.org/D67662

Files:
  llvm/lib/Target/AMDGPU/SIFoldOperands.cpp


Index: llvm/lib/Target/AMDGPU/SIFoldOperands.cpp
===================================================================
--- llvm/lib/Target/AMDGPU/SIFoldOperands.cpp
+++ llvm/lib/Target/AMDGPU/SIFoldOperands.cpp
@@ -417,7 +417,23 @@
                             const MachineInstr &MI,
                             const MachineOperand &UseMO) {
   return !UseMO.isUndef() && !TII->isSDWA(MI);
-  //return !MI.hasRegisterImplicitUseOperand(UseMO.getReg());
+}
+
+static bool isFoldingRegisterAcrossExecDef(const MachineInstr &MI,
+                                          const MachineOperand &UseMO) {
+  const MachineInstr *Def = UseMO.getParent();
+  const MachineBasicBlock *MBB = MI.getParent();
+  const MachineBasicBlock *DefBB = Def->getParent();
+  bool CrossDefExec = false;
+  if (!const_cast<MachineBasicBlock *>(DefBB)->canFallThrough() &&
+      (DefBB != MBB)) {
+    MachineBasicBlock::const_iterator IT =
+        Def->getParent()->getFirstTerminator();
+    const MachineInstr *Term = &*IT;
+    CrossDefExec = Def->hasRegisterImplicitUseOperand(AMDGPU::EXEC) &&
+                   Term->definesRegister(AMDGPU::EXEC);
+  }
+  return CrossDefExec;
 }
 
 static bool tryToFoldACImm(const SIInstrInfo *TII,
@@ -1086,6 +1102,9 @@
     Copy->addImplicitDefUseOperands(*MF);
 
   for (FoldCandidate &Fold : FoldList) {
+    if (Fold.isReg() &&
+        isFoldingRegisterAcrossExecDef(*Fold.UseMI, *Fold.OpToFold))
+      continue;
     if (updateOperand(Fold, *TII, *TRI, *ST)) {
       // Clear kill flags.
       if (Fold.isReg()) {



