[PATCH] D120544: [AMDGPU] Omit unnecessary waitcnt before barriers
Marek Olšák via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue May 3 09:26:45 PDT 2022
mareko added a comment.
The IR (NIR) that we translate to LLVM typically contains this sequence of instructions:
1. memory_barrier (fence)
2. control_barrier (s_barrier)
Either of them is optional, which means a memory barrier can occur without a control barrier, and vice versa. SPIR-V also works like this.
The memory barrier (fence) has the expressive power to wait for one or more of these:
- all shared memory opcodes
- only HS output shared memory opcodes
- all memory opcodes
- only global memory opcodes (not buffer or image)
- only buffer opcodes
- only image opcodes
- only buffer opcodes using shader storage buffer object variables (not other buffers if not aliased)
- only buffer/image opcodes using image variables (image variables can also access buffers, but shouldn't include any other buffers if not aliased)
Two exclusions always apply:
- fences never wait for opcodes writing into write-only buffers
- fences never wait for opcodes reading from read-only buffers and images
What the LLVM fence can do:
- everything or nothing
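
To make the gap concrete, here is a minimal sketch that models the two kinds of fence as bitmasks. All names (FenceScope, fineGrainedWait, allOrNothingWait, overWait) are hypothetical and are not NIR, SPIR-V, or LLVM identifiers. The scope bits cover only four of the categories listed above. This models expressiveness only, not how the backend actually places s_waitcnt.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical scope bits loosely mirroring the list above (illustration only).
enum FenceScope : std::uint32_t {
  kShared = 1u << 0,  // shared memory opcodes
  kGlobal = 1u << 1,  // global memory opcodes (not buffer or image)
  kBuffer = 1u << 2,  // buffer opcodes
  kImage  = 1u << 3,  // image opcodes
  kAllMemory = kShared | kGlobal | kBuffer | kImage,
};

// A fine-grained fence waits only for the requested classes that actually
// have operations in flight.
constexpr std::uint32_t fineGrainedWait(std::uint32_t requested,
                                        std::uint32_t outstanding) {
  return requested & outstanding;
}

// An all-or-nothing fence: any non-empty request waits for every
// outstanding class; an empty request waits for nothing.
constexpr std::uint32_t allOrNothingWait(std::uint32_t requested,
                                         std::uint32_t outstanding) {
  return requested != 0 ? outstanding : 0u;
}

// Classes the all-or-nothing fence waits for that were never requested.
constexpr std::uint32_t overWait(std::uint32_t requested,
                                 std::uint32_t outstanding) {
  return allOrNothingWait(requested, outstanding) &
         ~fineGrainedWait(requested, outstanding);
}

// Example: the shader only asks to order shared memory, but an image
// operation is also in flight.
inline void checkExample() {
  const std::uint32_t outstanding = kShared | kImage;
  assert(fineGrainedWait(kShared, outstanding) == kShared);
  assert(allOrNothingWait(kShared, outstanding) == outstanding);
  assert(overWait(kShared, outstanding) == kImage);  // unnecessary image wait
}
```

In this model, a fence that only needs to order shared memory also ends up waiting for the in-flight image operation. That extra wait is the cost of an "everything or nothing" fence.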
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D120544/new/
https://reviews.llvm.org/D120544