[llvm] [AMDGPU] Avoid unneeded waitcounts before spill stores (PR #108303)

Stanislav Mekhanoshin via llvm-commits llvm-commits at lists.llvm.org
Thu Sep 12 12:53:12 PDT 2024


================
@@ -901,7 +901,7 @@ void WaitcntBrackets::updateByEvent(const SIInstrInfo *TII,
     }
   } else /* LGKM_CNT || EXP_CNT || VS_CNT || NUM_INST_CNTS */ {
     // Match the score to the destination registers.
-    for (unsigned I = 0, E = Inst.getNumOperands(); I != E; ++I) {
+    for (unsigned I = 0, E = Inst.getNumExplicitOperands(); I != E; ++I) {
----------------
rampitec wrote:

I have added a test with a vcc_lo load and an implicit use of vcc; it has the wait, as expected.

The change is correct because only memory instructions reach the modified loop; VALU instructions are processed in a different place. Memory instructions may implicitly use m0 and flat_scr, but neither register can be the destination of a load.
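To illustrate the difference between the two loop bounds, here is a minimal standalone sketch. It does not use LLVM: `ToyOperand`, `ToyInst`, and `scoredDefs` are hypothetical stand-ins, and the register names are made up. The only things it takes from the real code are that explicit operands come before implicit ones in the operand list, and that the loop records scores for destination (def) registers.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a machine operand; not the LLVM class.
struct ToyOperand {
  std::string Reg;
  bool IsDef;
  bool IsImplicit;
};

// Hypothetical stand-in for a machine instruction. Explicit operands
// are stored first and implicit operands follow them.
struct ToyInst {
  std::vector<ToyOperand> Operands;

  unsigned getNumOperands() const {
    return static_cast<unsigned>(Operands.size());
  }

  // Count the leading run of explicit operands.
  unsigned getNumExplicitOperands() const {
    unsigned N = 0;
    while (N < Operands.size() && !Operands[N].IsImplicit)
      ++N;
    return N;
  }
};

// Collect the def registers the loop would visit, using either the
// explicit-only bound or the full operand count.
std::set<std::string> scoredDefs(const ToyInst &Inst, bool ExplicitOnly) {
  std::set<std::string> Regs;
  unsigned E = ExplicitOnly ? Inst.getNumExplicitOperands()
                            : Inst.getNumOperands();
  for (unsigned I = 0; I != E; ++I)
    if (Inst.Operands[I].IsDef)
      Regs.insert(Inst.Operands[I].Reg);
  return Regs;
}
```

In this toy model, a load whose explicit destination is vcc_lo is still recorded under the explicit-only bound, so a later implicit use of vcc can still find the pending score. Only defs that appear solely as implicit operands stop being recorded.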

https://github.com/llvm/llvm-project/pull/108303

