[PATCH] D120544: [AMDGPU] Omit unnecessary waitcnt before barriers

Marek Olšák via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed May 4 12:12:40 PDT 2022


mareko added a comment.
Herald added a subscriber: jsilvanus.

> I suppose the only difference this makes here is potentially in the counter value we wait for? I.e., we may wait for vmcnt(N) with N != 0?

LLVM IR can track everything accurately, and the compiler can choose what to do with that information. The hardware isn't that flexible, but it can still do one of three things: wait for nothing if that memory scope is not busy, wait for vmcnt(N) with N > 0, or wait for vmcnt(0).

I've changed the aliasing rules in my comment so that waiting for a memory scope shouldn't also wait for any other memory scope, even if they alias (the memory barrier is a user's choice).
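
To illustrate the three outcomes (no wait, vmcnt(N) with N > 0, vmcnt(0)), here is a minimal sketch, not the actual SIInsertWaitcnts logic. It assumes the outstanding memory operations complete in issue order and that each one carries a single scope tag; ScopeId and requiredVmcnt are hypothetical names made up for this example.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical scope tag, for illustration only; not an LLVM type.
using ScopeId = int;

// `pending` lists the outstanding memory operations, oldest first.
// Returns the N of the vmcnt(N) wait that covers every pending operation
// tagged with `scope`, or std::nullopt if that scope is not busy.
// Assumption: operations complete in issue order, so allowing N operations
// to stay outstanding means only the N youngest may still be in flight.
std::optional<std::size_t> requiredVmcnt(const std::vector<ScopeId> &pending,
                                         ScopeId scope) {
  // Find the youngest pending operation in the requested scope.
  for (std::size_t i = pending.size(); i-- > 0;) {
    if (pending[i] == scope)
      return pending.size() - 1 - i; // Number of younger operations.
  }
  return std::nullopt; // Scope is idle: no wait needed.
}
```

With pending = {1, 2, 2} and scope 1 this gives vmcnt(2), with pending = {2, 1} it gives vmcnt(0), and with pending = {2, 2} no wait is needed. Operations in other scopes only affect the count; they are never waited on for their own sake, which matches the aliasing rule described above.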


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D120544/new/

https://reviews.llvm.org/D120544
