[PATCH] D149332: [AMDGPU] Also consider global and scratch instructions when flushing vmcnt counter in loop preheader
Baptiste Saleil via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Apr 27 07:24:10 PDT 2023
bsaleil added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp:1724
for (MachineInstr &MI : *MBB) {
- if (SIInstrInfo::isVMEM(MI)) {
+ if (updateVMCntOnly(MI)) {
if (MI.mayLoad())
----------------
foad wrote:
> My only slight concern is whether we should also accept FLAT instructions here? They update vmcnt but not //only// vmcnt. I'm not sure what the answer is.
I think it is still better to flush vmcnt in this case.
With a flat load, we would have:
```v0 = flat_load(...)
s_waitcnt vmcnt(0)
loop {
...
s_waitcnt lgkmcnt(0)
use(v0)
...
store(...)
...
}```
Which is better than having a s_waitcnt vmcnt in the loop. If the store is also a flat store, it may be worth flushing lgkmcnt too, but I don't know if this case is common or not.
Anyway, we should add tests cases for that. @rochauha, can you extend `waitcnt-vmcnt-loop.mir` with a minimal test case for global and flat instructions ?
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D149332/new/
https://reviews.llvm.org/D149332
More information about the llvm-commits
mailing list