[PATCH] D120544: [AMDGPU] Omit unnecessary waitcnt before barriers

Marek Olšák via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue May 3 09:26:45 PDT 2022


mareko added a comment.

The IR (NIR) that we translate to LLVM typically contains this sequence of instructions:

1. memory_barrier (fence)
2. control_barrier (s_barrier)

Each of them is optional: a memory barrier can occur without a control barrier, and vice versa. SPIR-V works the same way.

The memory barrier (fence) has the expressive power to wait for one or more of these:

- all shared memory opcodes
- only HS output shared memory opcodes
- all memory opcodes
- only global memory opcodes (not buffer or image)
- only buffer opcodes
- only image opcodes
- only buffer opcodes using shader storage buffer object variables (not other buffers if not aliased)
- only buffer/image opcodes using image variables (image variables can also access buffers, but shouldn't include any other buffers if not aliased)

In addition:

- fences never wait for opcodes writing into write-only buffers
- fences never wait for opcodes reading from read-only buffers and images

What the LLVM fence can do:

- everything or nothing
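
To make the mismatch concrete, here is a minimal sketch of the two models. It is only an illustration, not the actual NIR or LLVM code: the names `MemScope`, `ALL_MEMORY`, `lower_to_llvm_fence` and `over_synchronized` are hypothetical, the variable-based and read-only/write-only refinements from the list are left out, and HS output memory is treated as its own class for simplicity even though it is a subset of shared memory.

```python
from enum import Flag, auto


class MemScope(Flag):
    """Hypothetical memory classes a NIR-style fence can name (simplified)."""
    NONE = 0
    SHARED = auto()
    HS_OUTPUT = auto()   # simplification: really a subset of shared memory
    GLOBAL = auto()
    BUFFER = auto()
    IMAGE = auto()


# Union of every class in this simplified model.
ALL_MEMORY = (MemScope.SHARED | MemScope.HS_OUTPUT | MemScope.GLOBAL
              | MemScope.BUFFER | MemScope.IMAGE)


def lower_to_llvm_fence(requested: MemScope) -> MemScope:
    """Model an all-or-nothing fence: any non-empty request waits for everything."""
    return ALL_MEMORY if requested else MemScope.NONE


def over_synchronized(requested: MemScope) -> MemScope:
    """Return the memory classes the lowered fence waits for without being asked."""
    return lower_to_llvm_fence(requested) & ~requested


if __name__ == "__main__":
    # A fence that only needs image opcodes still waits for every other class.
    print(over_synchronized(MemScope.IMAGE))
```

In this model, a fence that names a single class is lowered to a wait on all classes, so `over_synchronized` returns everything except the requested class.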


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D120544/new/

https://reviews.llvm.org/D120544


