[Openmp-commits] [PATCH] D65013: [OPENMP][NVPTX]Fix parallel level counter in Cuda 9.0.
Johannes Doerfert via Phabricator via Openmp-commits
openmp-commits at lists.llvm.org
Tue Jul 23 03:57:04 PDT 2019
jdoerfert added a comment.
In D65013#1596990 <https://reviews.llvm.org/D65013#1596990>, @ABataev wrote:
> In D65013#1596821 <https://reviews.llvm.org/D65013#1596821>, @jdoerfert wrote:
> > I'm confused, partly about the "convergent" part.
> > The code looks vastly different but no tests are affected?
> > Could you please point out how to reproduce the problem?
> > Where did the shuffles go?
> > Why is there a threadfence and syncwarp now?
> > Which old accesses were problematic and why?
> There is a problem with at least one test in CUDA 9+: spmd_parallel_regions.cpp. To fix this problem we need three things: fix the test itself (see D65112 <https://reviews.llvm.org/D65112>), fix the runtime part (this patch), and fix the handling of critical sections in the compiler (the third patch, which depends on this one).
There seems to be a problem with this "fix", not with the test. So far, at least, the argument has been about CUDA 9 semantics, which should be irrelevant to the test. If there is a problem, then it is that the runtime doesn't implement OpenMP semantics properly for that test. Modifying the test will only hide that problem.