[PATCH] D149332: [AMDGPU] Also consider global and scratch instructions when flushing vmcnt counter in loop preheader
Jay Foad via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Apr 27 07:30:58 PDT 2023
foad added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp:1724
     for (MachineInstr &MI : *MBB) {
-      if (SIInstrInfo::isVMEM(MI)) {
+      if (updateVMCntOnly(MI)) {
         if (MI.mayLoad())
----------------
bsaleil wrote:
> foad wrote:
> > My only slight concern is whether we should also accept FLAT instructions here? They update vmcnt but not //only// vmcnt. I'm not sure what the answer is.
> I think it is still better to flush vmcnt in this case.
> With a flat load, we would have:
>
> ```
> v0 = flat_load(...)
> s_waitcnt vmcnt(0)
> loop {
>   ...
>   s_waitcnt lgkmcnt(0)
>   use(v0)
>   ...
>   store(...)
>   ...
> }
> ```
>
> This is better than having an `s_waitcnt vmcnt` inside the loop. If the store is also a flat store, it may be worth flushing lgkmcnt too, but I don't know whether this case is common.
> Anyway, we should add test cases for that. @rochauha, can you extend `waitcnt-vmcnt-loop.mir` with a minimal test case for global and flat instructions?
> I think it is still better to flush vmcnt in this case.
OK, so maybe we should test `isVMEM || isFLAT` here?
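
To make the difference between the two predicates concrete, here is a rough standalone sketch. It is a toy model, not the code in SIInsertWaitcnts.cpp: `MemKind`, `updatesVMCnt`, `updatesLGKMCnt`, `updatesVMCntOnly` and `isVMEMOrFLAT` are hypothetical names, and the counter assignments are assumptions taken from the discussion above (loads only; a plain flat load updates both vmcnt and lgkmcnt, while global and scratch loads update only vmcnt).

```cpp
#include <cassert>

// Toy stand-in for the instruction classes discussed in this review.
// Hypothetical names; this is not the LLVM API.
enum class MemKind { Buffer, Global, Scratch, Flat, LDS };

// Assumption: buffer, global, scratch and flat loads all increment vmcnt.
constexpr bool updatesVMCnt(MemKind K) { return K != MemKind::LDS; }

// Assumption: a plain flat load may access LDS, so it also increments lgkmcnt.
constexpr bool updatesLGKMCnt(MemKind K) {
  return K == MemKind::Flat || K == MemKind::LDS;
}

// Models the check in the patch: vmcnt is the only counter affected.
constexpr bool updatesVMCntOnly(MemKind K) {
  return updatesVMCnt(K) && !updatesLGKMCnt(K);
}

// Models the alternative suggested above: also accept plain flat loads.
constexpr bool isVMEMOrFLAT(MemKind K) {
  return K == MemKind::Buffer || K == MemKind::Global ||
         K == MemKind::Scratch || K == MemKind::Flat;
}
```

Under these assumptions the two predicates disagree only on plain flat loads: `updatesVMCntOnly` rejects them, so no preheader flush would be considered, while `isVMEMOrFLAT` accepts them.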
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D149332/new/
https://reviews.llvm.org/D149332