[Openmp-commits] [PATCH] D62393: [OPENMP][NVPTX]Mark parallel level counter as volatile.
Johannes Doerfert via Phabricator via Openmp-commits
openmp-commits at lists.llvm.org
Mon Jun 10 18:04:45 PDT 2019
jdoerfert added a comment.
In D62393#1533461 <https://reviews.llvm.org/D62393#1533461>, @ABataev wrote:
> In D62393#1533454 <https://reviews.llvm.org/D62393#1533454>, @jdoerfert wrote:
> > >> The standard has "levels" and "active-levels". Both need to be tracked but it seems we only have a single array?
> > > >
> > >
> > > They all are tracked on this array.
> > Where is that described?
> Sorry, read the code. I'm not a debugger.
First of all, this response is simply rude.
Second, you wrote the code so my question is well justified.
Third, does that mean you encode multiple values in a single int array but did not document this at all? The code is not documentation.
Finally, this will break once we have a parallel region in a recursive loop with more than 254 instantiations, correct?
>>> We don't need to synchronize the whole program, we just need to synchronize accesses to the parallelLevel array. In SPMD mode with full runtime (or SPMD mode eith atomic ops) the ptx compiler generates unsafe code for parallelLevel array without being marked as volatile (i.e. simple atomic-like construct in CUDa ).
>> Nobody argues we should synchronize the whole program. If you want to have atomic accesses to the parallelLevel array, make the accesses to the parallelLevel array atomic.
>> You cannot just say that the PTX compiler generates unsafe code. That is not an argument that is a guess.
> Volatile provides required level of atomicity per PTX ISA.
Arguably, atomic accesses provide atomicity. You also did never respond to the question which other accesses actually race with the ones to `parallelLevel` or why you do not synchronize the accesses to `parallelLevel` across the warp explicitly.
CHANGES SINCE LAST ACTION
More information about the Openmp-commits