[PATCH] D120544: [AMDGPU] Omit unnecessary waitcnt before barriers
Marek Olšák via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed May 4 12:12:40 PDT 2022
mareko added a comment.
Herald added a subscriber: jsilvanus.
> I suppose the only difference this makes here is potentially in the counter value we wait for? I.e., we may wait for vmcnt(N) with N != 0?
LLVM IR can track everything accurately, and the compiler can choose what to do with that information. The hardware isn't that flexible, but it could wait for nothing if that memory scope is not busy, or otherwise wait for vmcnt(N) with N > 0, or for vmcnt(0).
I've changed the aliasing rules in my comment such that waiting for a memory scope shouldn't wait for any other memory scope even if they alias (the memory barrier is a user's choice).
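To make the three outcomes concrete, here is a rough sketch of how a wait value could be picked for one memory scope. It is not the code in this patch: the `Scope` enum, the `vmcntForScope` helper and the list of pending operations are all hypothetical, and it assumes the tracked operations complete in issue order, so that vmcnt(N) means "wait until at most N operations are still outstanding".

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical labels for the memory scope each outstanding operation touches.
enum class Scope { Global, Buffer, Image };

// Returns N for a vmcnt(N) wait, or std::nullopt when no wait is needed.
// `pending` lists outstanding operations from oldest to newest.
// Assumption: operations complete in issue order.
std::optional<unsigned> vmcntForScope(const std::vector<Scope> &pending,
                                      Scope wanted) {
  // Find the newest outstanding operation that touches the wanted scope.
  for (std::size_t i = pending.size(); i-- > 0;) {
    if (pending[i] == wanted) {
      // Operations issued after it may stay in flight, so N is their count.
      return static_cast<unsigned>(pending.size() - 1 - i);
    }
  }
  // The scope is not busy: wait for nothing.
  return std::nullopt;
}
```

With pending operations {Global, Buffer, Buffer}, a barrier on Global would get vmcnt(2), a barrier on Buffer would get vmcnt(0), and a barrier on Image would need no wait at all. Other scopes are never waited on just because they might alias, which matches the rule described above.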
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D120544/new/
https://reviews.llvm.org/D120544