[llvm-branch-commits] [llvm] [AMDGPU][InsertWaitCnts] Track global_wb/inv/wbinv (PR #135340)
Pierre van Houtryve via llvm-branch-commits
llvm-branch-commits at lists.llvm.org
Fri Apr 11 03:04:20 PDT 2025
================
@@ -698,6 +698,16 @@ class SIInsertWaitcnts {
// Return the appropriate VMEM_*_ACCESS type for Inst, which must be a VMEM or
// FLAT instruction.
WaitEventType getVmemWaitEventType(const MachineInstr &Inst) const {
+ switch (Inst.getOpcode()) {
----------------
Pierre-vh wrote:
I'm not fully certain this is correct. I think it is for the WB case, because we must not optimize out the storecnt wait that the memory legalizer adds after the WB. For the INV, however, there is no wait. Some tests still end up with a wait added, especially before the end of the function.
I'd like this to just not optimize out soft waitcnts after a WB; it doesn't need to insert new waits. I'm not sure how to do that.
Maybe we could get away with intentionally not tracking INV?
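
To make the desired behavior concrete, here is a minimal standalone sketch of it as a toy model. It is not the SIInsertWaitcnts implementation and uses none of its APIs: the `Op` enum and `pruneSoftWaits` are hypothetical names invented for illustration. It assumes a soft wait is kept only while a write-back is outstanding, that no wait is ever inserted, and that INV is deliberately left untracked.

```cpp
#include <vector>

// Hypothetical toy model of an instruction stream; these names are
// illustrative only and do not exist in LLVM.
enum class Op { GlobalWB, GlobalINV, SoftStoreWait, Other };

// Keeps a soft storecnt wait only if a write-back is still outstanding.
// Never inserts a new wait, and intentionally does not track INV.
std::vector<Op> pruneSoftWaits(const std::vector<Op> &In) {
  std::vector<Op> Out;
  bool PendingWB = false;
  for (Op O : In) {
    if (O == Op::GlobalWB)
      PendingWB = true; // a later soft wait must now be preserved
    if (O == Op::SoftStoreWait) {
      if (!PendingWB)
        continue;        // nothing outstanding: the soft wait is redundant
      PendingWB = false; // this wait covers the outstanding write-back
    }
    Out.push_back(O);
  }
  return Out;
}
```

In this model the sequence WB, soft wait, INV, soft wait keeps the first wait and drops the second, so an INV on its own never causes a wait to appear. Whether the real pass can express "preserve but never insert" this way is exactly the open question above.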
https://github.com/llvm/llvm-project/pull/135340