[Mlir-commits] [mlir] [mlir][gpu] Add address space modifier to Barrier (PR #110527)
Fabian Mora
llvmlistbot at llvm.org
Mon Sep 30 11:23:43 PDT 2024
fabianmcg wrote:
> I'd expect these to not wait for all threads to reach the same program point, right?
A fence never makes thread synchronization guarantees, only memory-ordering guarantees (see https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#memory-fence-functions). That is precisely the issue with this change: the two are completely separate concepts.
> Would this definition make `gpu.barrier` correspond to HIP/CUDA's `__syncthreads()`? Since I have a suspicion that that was the original semantics of `gpu.barrier`
Yes, `gpu.barrier` corresponds to `__syncthreads()`.
> IE ideally, I'd like to have separate ops for waiting for all threads to reach the execution point (say, `gpu.synchronize`) and for specifying fences (say, `gpu.memfence`). Then I'd imagine for `gpu.barrier` to be defined as doing both (with the maximal memory fence scope).
I'm inclined to keep `barrier` as it is and only add `memfence`. Programming models usually use the name `barrier` to mean a thread barrier, e.g. https://www.openmp.org/spec-html/5.0/openmpsu90.html
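To make the distinction concrete, here is a minimal CUDA sketch, not taken from the PR. The kernel names `reverseWithBarrier` and `publishWithFence` are hypothetical, and the example assumes a single block of 256 threads. The first kernel needs a thread barrier, because each thread reads a value another thread wrote. The second only needs ordering between two writes by the same thread, so a fence is enough and no thread waits for any other.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Thread barrier: every thread in the block must reach __syncthreads()
// before any of them continues past it.
__global__ void reverseWithBarrier(int *data, int n) {
  __shared__ int tile[256];  // assumes n <= 256 and a single block
  int t = threadIdx.x;
  if (t < n) tile[t] = data[t];
  __syncthreads();  // all writes to tile are complete and visible here
  if (t < n) data[t] = tile[n - 1 - t];
}

// Memory fence: orders the calling thread's writes as seen by other
// threads, but does not make any thread wait.
__global__ void publishWithFence(int *payload, volatile int *flag) {
  if (threadIdx.x == 0) {
    *payload = 42;
    __threadfence();  // payload write is ordered before the flag write
    *flag = 1;
  }
  // The remaining threads fall through immediately; nothing synchronizes them.
}

int main() {
  const int n = 256;
  int host[n];
  for (int i = 0; i < n; ++i) host[i] = i;

  int *dev = nullptr, *payload = nullptr, *flag = nullptr;
  cudaMalloc(&dev, n * sizeof(int));
  cudaMalloc(&payload, sizeof(int));
  cudaMalloc(&flag, sizeof(int));
  cudaMemcpy(dev, host, n * sizeof(int), cudaMemcpyHostToDevice);
  cudaMemset(payload, 0, sizeof(int));
  cudaMemset(flag, 0, sizeof(int));

  reverseWithBarrier<<<1, n>>>(dev, n);
  publishWithFence<<<1, n>>>(payload, flag);
  cudaDeviceSynchronize();

  int hostPayload = 0, hostFlag = 0;
  cudaMemcpy(host, dev, n * sizeof(int), cudaMemcpyDeviceToHost);
  cudaMemcpy(&hostPayload, payload, sizeof(int), cudaMemcpyDeviceToHost);
  cudaMemcpy(&hostFlag, flag, sizeof(int), cudaMemcpyDeviceToHost);
  printf("first=%d last=%d payload=%d flag=%d\n",
         host[0], host[n - 1], hostPayload, hostFlag);

  cudaFree(dev);
  cudaFree(payload);
  cudaFree(flag);
  return 0;
}
```

Removing `__syncthreads()` from the first kernel introduces a race on `tile`, and replacing it with `__threadfence()` would not fix that race, since a fence does not wait for the other threads' writes to happen.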
https://github.com/llvm/llvm-project/pull/110527