[Mlir-commits] [mlir] [mlir][gpu] Add field to mark asynchronous side effects (PR #72013)

Guray Ozen llvmlistbot at llvm.org
Wed Nov 15 11:58:18 PST 2023


grypp wrote:

The idea needs an RFC. In my opinion, the idea is worth exploring. FWIW, asynchrony goes beyond `nvgpu.device_async_copy`. For instance, we also have `nvgpu.tma.async.load` and `nvgpu.warpgroup.mma`.

What about shared memory?
Shared memory is a write-back cache. When a thread writes to it (like `myx = 123` below), the write may not be immediately visible to other threads. Using `gpu.barrier` for this is quite heavy-handed. In this case, you want to use `__threadfence` to make the write visible to other threads without synchronizing them.

```
__shared__ int myx;
if(threadIdx.x) myx = 123;
...
... 
// myx might not be visible to other threads?
printf("%d", myx); // need __threadfence() before printf
```
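
To make the snippet above concrete, here is a minimal sketch of the fence-without-barrier pattern. Everything beyond `myx` is an assumption for illustration: the `ready` flag, the kernel name `publish_kernel`, the host helper `read_published_value`, and the choice of threads 0 and 32 (so that the writer and the reader sit in different warps). It uses `__threadfence_block()`, the block-scoped variant of `__threadfence()`, since both threads are in the same block, and it marks the shared variables `volatile` so that the reads and writes are not kept in registers.

```cuda
#include <cuda_runtime.h>

// Hypothetical kernel: thread 0 publishes a value through shared memory and
// thread 32 (a different warp) consumes it, with no block-wide barrier
// between the write and the read.
__global__ void publish_kernel(int *out) {
    __shared__ volatile int myx;    // payload, as in the snippet above
    __shared__ volatile int ready;  // assumed flag, not in the original snippet

    if (threadIdx.x == 0) ready = 0;
    __syncthreads();  // one-time setup barrier so the flag starts at 0

    if (threadIdx.x == 0) {
        myx = 123;
        __threadfence_block();  // order the payload write before the flag write
        ready = 1;
    } else if (threadIdx.x == 32) {
        while (ready == 0) { }  // spin until the writer raises the flag
        __threadfence_block();  // order the flag read before the payload read
        *out = myx;
    }
}

// Hypothetical host helper: launches one block of 64 threads and returns the
// value the reader thread observed, or -1 if a CUDA call failed.
int read_published_value() {
    int *d_out = nullptr;
    int h_out = -1;
    if (cudaMalloc(&d_out, sizeof(int)) != cudaSuccess) return -1;
    publish_kernel<<<1, 64>>>(d_out);
    if (cudaMemcpy(&h_out, d_out, sizeof(int), cudaMemcpyDeviceToHost) != cudaSuccess) {
        h_out = -1;
    }
    cudaFree(d_out);
    return h_out;
}
```

The only `__syncthreads()` here initializes the flag. The hand-off itself relies on the two fences, which is the kind of ordering-only side effect that a `gpu.barrier` would over-approximate.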

https://github.com/llvm/llvm-project/pull/72013

