[Mlir-commits] [mlir] [mlir][gpu] Add field to mark asynchronous side effects (PR #72013)
Guray Ozen
llvmlistbot at llvm.org
Wed Nov 15 11:58:18 PST 2023
grypp wrote:
The idea needs an RFC. In my opinion, it is worth exploring. FWIW, asynchrony goes beyond `nvgpu.device_async_copy`. For instance, we also have `nvgpu.tma.async.load` and `nvgpu.warpgroup.mma`.
What about shared memory?
Shared memory is a write-back cache. When a thread writes to it (like `myx = 123` below), the write may not be immediately visible to other threads. Using `gpu.barrier` for this is quite heavy-handed. In this case, you want to use `__threadfence` to make the write visible to other threads without synchronizing them.
```
__shared__ int myx;
if(threadIdx.x) myx = 123;
...
...
// myx might not be visible to other threads?
printf("%d", myx); // need __threadfence() before printf
```
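To make the fence-versus-barrier distinction concrete, here is a minimal sketch of publishing a shared-memory write with a fence instead of a block-wide barrier. The kernel name `publish_with_fence`, the `ready` flag, the host helper `run_publish_with_fence`, and the choice of thread 32 as the reader are all assumptions made for illustration; none of them come from the PR. It uses `__threadfence_block()`, the block-scoped variant of `__threadfence()`, which should be enough for shared memory; the stronger `__threadfence()` would work as well.

```
#include <cuda_runtime.h>

// Hypothetical kernel: thread 0 publishes a value through shared memory and
// thread 32 (a different warp) reads it. No barrier separates the write from
// the read; only fences order the accesses.
__global__ void publish_with_fence(int *out) {
  __shared__ volatile int myx;    // payload
  __shared__ volatile int ready;  // flag guarding the payload

  if (threadIdx.x == 0) ready = 0;
  __syncthreads();  // one-time setup barrier so every thread sees ready == 0

  if (threadIdx.x == 0) {
    myx = 123;
    __threadfence_block();  // order the payload write before the flag write
    ready = 1;
  } else if (threadIdx.x == 32) {
    while (ready == 0) { }  // spin until the writer raises the flag
    __threadfence_block();  // order the flag read before the payload read
    *out = myx;
  }
}

// Hypothetical host helper: launches one block of 64 threads (two warps) and
// returns the value the reader thread observed, or -1 on a CUDA error.
int run_publish_with_fence() {
  int *d_out = nullptr;
  int h_out = -1;
  if (cudaMalloc(&d_out, sizeof(int)) != cudaSuccess) return -1;
  publish_with_fence<<<1, 64>>>(d_out);
  cudaError_t err =
      cudaMemcpy(&h_out, d_out, sizeof(int), cudaMemcpyDeviceToHost);
  cudaFree(d_out);
  return err == cudaSuccess ? h_out : -1;
}
```

The writer and the reader sit in different warps so that the spin loop cannot stall the writer on hardware without independent thread scheduling. The only `__syncthreads()` is for initializing the flag; the other 62 threads never wait on the publish step, which is the point of preferring a fence over `gpu.barrier` here.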
https://github.com/llvm/llvm-project/pull/72013