[PATCH] D114643: [AMDGPU] Aggressively fold immediates in SIFoldOperands

Wed May 18 02:40:47 PDT 2022

foad added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/SIFoldOperands.cpp:1246-1249
+  SmallVector<MachineOperand *, 4> UsesToProcess;
+  for (auto &Use : MRI->use_nodbg_operands(Dst.getReg()))
+    UsesToProcess.push_back(&Use);
+  for (auto U : UsesToProcess) {
----------------
arsenm wrote:
> foad wrote:
> > sebastian-ne wrote:
> > > Can this use `make_early_inc_range` instead of caching the small vector like in the if-case? (if so, this probably makes more sense as an NFC patch afterwards)
> > Not sure. I can try that as a follow-up.
> The iterators in this pass are kind of a mess. I've wanted to rewrite this pass to work more like how PeepholeOpt works, collecting defs, visiting uses and looking for collected defs. 
I tried this, but it seems to get stuck in infinite loops in several lit tests. Not sure why:
```

diff --git a/llvm/lib/Target/AMDGPU/SIFoldOperands.cpp b/llvm/lib/Target/AMDGPU/SIFoldOperands.cpp
index 99aa8a60b04f..3159693a2b6e 100644
--- a/llvm/lib/Target/AMDGPU/SIFoldOperands.cpp
+++ b/llvm/lib/Target/AMDGPU/SIFoldOperands.cpp
@@ -1243,12 +1243,9 @@ bool SIFoldOperands::foldInstOperand(MachineInstr &MI,
     }
   }
 
-  SmallVector<MachineOperand *, 4> UsesToProcess;
-  for (auto &Use : MRI->use_nodbg_operands(Dst.getReg()))
-    UsesToProcess.push_back(&Use);
-  for (auto U : UsesToProcess) {
-    MachineInstr *UseMI = U->getParent();
-    foldOperand(OpToFold, UseMI, UseMI->getOperandNo(U), FoldList,
+  for (auto &U : make_early_inc_range(MRI->use_nodbg_operands(Dst.getReg()))) {
+    MachineInstr *UseMI = U.getParent();
+    foldOperand(OpToFold, UseMI, UseMI->getOperandNo(&U), FoldList,
                 CopiesToReplace);
   }
 
```


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D114643/new/

https://reviews.llvm.org/D114643