[Openmp-commits] [PATCH] D65013: [OPENMP][NVPTX]Fix parallel level counter in Cuda 9.0.

Johannes Doerfert via Phabricator via Openmp-commits openmp-commits at lists.llvm.org
Tue Jul 23 03:57:04 PDT 2019

jdoerfert added a comment.

In D65013#1596990 <https://reviews.llvm.org/D65013#1596990>, @ABataev wrote:

> In D65013#1596821 <https://reviews.llvm.org/D65013#1596821>, @jdoerfert wrote:
> > I'm confused, partly about the "convergent" part.
> >
> > The code looks vastly different but no tests are affected?
> >  Could you please point out how to reproduce the problem?
> >  Where did the shuffles go?
> >  Why is there a threadfence and syncwarp now?
> >  Which old accesses were problematic and why?
> There is a problem with at least 1 test in Cuda 9+: spmd_parallel_regions.cpp. To fix this problem we need 3 things: fix the test itself (see D65112 <https://reviews.llvm.org/D65112>), fix the runtime part (this patch), and fix the handling of critical sections in the compiler (the 3rd patch, which depends on this one).

There seems to be a problem with this "fix", not with the test. So far, at least, the argument has been about CUDA 9 semantics, which should be irrelevant to the test. If there is a problem, then it is that the runtime doesn't implement OpenMP semantics properly for that test. Modifying the test will only hide that problem.

  rOMP OpenMP


