[Openmp-commits] [PATCH] D62393: [OPENMP][NVPTX]Mark parallel level counter as volatile.

Alexey Bataev via Phabricator via Openmp-commits openmp-commits at lists.llvm.org
Mon Jun 3 18:04:29 PDT 2019

ABataev added a comment.

In D62393#1528421 <https://reviews.llvm.org/D62393#1528421>, @jdoerfert wrote:

> I still do not see why volatile is the right solution here:
>  In D61395 <https://reviews.llvm.org/D61395> the way the level was tracked was changed but I failed to see how exactly. We have to start there.
> - What is the constant `OMP_ACTIVE_PARALLEL_LEVEL` doing exactly?

It marks whether the parallel region has more than one thread. Only the very first parallel region may have more than one thread. It is required to implement the omp_in_parallel function correctly, at least.
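
To make the role of the constant concrete, here is a minimal host-side sketch of one way such a flag could be folded into a one-byte level counter. The value 128, the bit layout, and all helper names are assumptions for illustration only; the thread does not spell out the encoding, and this is not the runtime's actual code.

```cpp
#include <cstdint>

// Assumed encoding (not confirmed by this thread): the top bit of a one-byte
// counter flags an active (multi-threaded) region, the low 7 bits hold the depth.
constexpr std::uint8_t kActiveParallelLevel = 128; // hypothetical value

// Enter a parallel region; only an active region contributes the flag.
inline std::uint8_t enterParallel(std::uint8_t level, bool active) {
  return static_cast<std::uint8_t>(level + 1 +
                                   (active ? kActiveParallelLevel : 0));
}

// Leave a parallel region, undoing exactly what enterParallel added.
inline std::uint8_t exitParallel(std::uint8_t level, bool active) {
  return static_cast<std::uint8_t>(level - 1 -
                                   (active ? kActiveParallelLevel : 0));
}

// Nesting depth with the flag masked off.
inline int nestingLevel(std::uint8_t level) {
  return level & (kActiveParallelLevel - 1);
}

// What an omp_in_parallel-style query would report under this encoding.
inline bool inActiveParallel(std::uint8_t level) {
  return level > kActiveParallelLevel;
}
```

With this layout a serialized nested region only bumps the depth, so the "in parallel" answer keeps depending solely on whether the first region was active.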

> - Probably even more fundamental, what is encoded in the `parallelLevel` array? It is not (only) the parallel level, correct? It is per warp, correct?

It is only the parallel level, and yes, it is per warp, which is required to handle L2+ parallelism. Because it is per warp, it has to be volatile: when accesses to it are combined with atomic operations and fully inlined runtime functions, the memory ordering is not preserved, which leads to undefined behavior in full runtime mode, which uses atomic operations.
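
As a rough illustration of the per-warp layout, the sketch below keeps one volatile counter per warp and indexes it by thread id. It is plain host C++ rather than device code, and the array name, the helper names, and the 1024-thread team limit are hypothetical. It shows only what the qualifier does at the language level: each access is a real load or store that the compiler may not cache or drop. It does not make the accesses atomic, and it says nothing about whether volatile is the right fix, which is the point under review.

```cpp
#include <cstdint>

constexpr int kWarpSize = 32;            // NVIDIA warp width
constexpr int kMaxThreadsPerTeam = 1024; // assumed team-size limit

// One counter per warp. volatile makes every access an observable load/store;
// it does not make accesses atomic or order them against non-volatile memory.
volatile std::uint8_t warpLevel[kMaxThreadsPerTeam / kWarpSize];

// All threads of a warp map to the same slot.
inline int warpIdOf(int threadId) { return threadId / kWarpSize; }

// Explicit read-modify-write (compound assignment on volatile is deprecated
// in C++20).
inline void incLevel(int threadId) {
  int w = warpIdOf(threadId);
  warpLevel[w] = static_cast<std::uint8_t>(warpLevel[w] + 1);
}

inline void decLevel(int threadId) {
  int w = warpIdOf(threadId);
  warpLevel[w] = static_cast<std::uint8_t>(warpLevel[w] - 1);
}

inline int levelOf(int threadId) { return warpLevel[warpIdOf(threadId)]; }
```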

>> This is especially important in case of thread divergence mixed with atomic operations.
> We do not use atomics on the `parallelLevel` do we?

Atomics are not used on parallelLevel itself, but for dynamic scheduling and other operations, such as the SM management used in full runtime mode.
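
For readers unfamiliar with why dynamic scheduling needs atomics, here is a minimal host-side sketch of the general technique: workers claim disjoint chunks of an iteration space by atomically advancing a shared counter. The `Chunk` type and `claimChunk` helper are hypothetical and are not taken from the device runtime.

```cpp
#include <algorithm>
#include <atomic>

// Half-open iteration range [begin, end).
struct Chunk {
  int begin;
  int end;
};

// Claim the next chunk of [0, n). Returns false once the space is exhausted.
// Any number of threads may call this concurrently on the same counter: the
// fetch_add hands each caller a distinct starting index.
inline bool claimChunk(std::atomic<int> &next, int n, int chunk, Chunk &out) {
  int begin = next.fetch_add(chunk); // atomically reserve `chunk` iterations
  if (begin >= n)
    return false;
  out.begin = begin;
  out.end = std::min(begin + chunk, n); // the last chunk may be shorter
  return true;
}
```

Each worker would loop on `claimChunk` until it returns false, so a loop with dynamic scheduling issues atomic operations throughout, alongside whatever other shared state the runtime reads and writes.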

> On which level is the thread divergence, in a warp?


> And another fundamental question:
>  What is the connection of the test to the change? It looks totally unrelated as it does not check the parallel level at all.

It demonstrates the problem in SPMD mode with the full runtime, which actively uses atomic operations. Dynamic scheduling forces the use of atomics, which is what exposes the problem.

  rOMP OpenMP


