[Openmp-commits] [PATCH] D51875: [OPENMP][NVPTX] Add support for lastprivates/reductions handling in SPMD constructs with lightweight runtime.
Alexey Bataev via Phabricator via Openmp-commits
openmp-commits at lists.llvm.org
Fri Sep 28 07:38:42 PDT 2018
ABataev added a comment.
In https://reviews.llvm.org/D51875#1249088, @ABataev wrote:
> In https://reviews.llvm.org/D51875#1249034, @Hahnfeld wrote:
> > In https://reviews.llvm.org/D51875#1249019, @ABataev wrote:
> > >
> > Yes, that's the current solution in Clang and actually what I described above:
> > In https://reviews.llvm.org/D51875#1248997, @Hahnfeld wrote:
> > > In SPMD constructs all CUDA threads are executing the `distribute` loop, but only the thread executing the last iteration of the `for` loop has seen the `lastprivate` value. However, the information about which thread this is has been lost by the end of the `parallel` region. So data sharing is used to communicate the `lastprivate` value to all threads in the team that is executing the last `distribute` chunk.
> > I'm assuming that the pointer returned by `/* get data sharing frame from runtime */` is shared between all threads in a team.
> 1. This is not just how Clang works; it is what the standard requires.
> 2. Yes, it is shared between all the threads in the team, and this is how it is intended to be according to the standard.
The main problem with your solution is that the `distribute` loop has no information about which thread actually executed the last chunk of the loop. All the threads in the last team must execute the same check, and only one of them shall write its private value to the original variable. But, just like I said, the runtime does not provide this information to the compiler.
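
To make the "same check, single writer" requirement concrete, here is a minimal host-side sketch in plain C++. It is only a model under stated assumptions, not Clang's codegen and not the libomptarget runtime: the name `simulateLastprivate`, the `LastprivateResult` struct, the contiguous per-team chunking, and the cyclic per-thread schedule are all hypothetical and chosen for illustration.

```cpp
#include <cassert>

// Hypothetical result record: the value written back to the original
// variable, how many threads performed the write, and which team/thread did.
struct LastprivateResult {
  int value;
  int writers;
  int team;
  int thread;
};

// Models a combined distribute + for loop over iterations [0, n).
// Assumed schedule (illustration only): each team gets one contiguous
// chunk, and threads inside a team take iterations cyclically.
LastprivateResult simulateLastprivate(int n, int numTeams, int threadsPerTeam) {
  assert(n > 0 && numTeams > 0 && threadsPerTeam > 0);
  LastprivateResult result{-1, 0, -1, -1};
  const int chunk = (n + numTeams - 1) / numTeams;
  for (int team = 0; team < numTeams; ++team) {
    const int lb = team * chunk;
    const int ub = (lb + chunk < n) ? lb + chunk : n;
    for (int tid = 0; tid < threadsPerTeam; ++tid) {
      int priv = -1;        // this thread's private copy
      bool sawLast = false; // did this thread run the sequentially last iteration?
      for (int i = lb + tid; i < ub; i += threadsPerTeam) {
        priv = i * i;       // stand-in for the loop body
        if (i == n - 1)
          sawLast = true;
      }
      // Every thread runs the same check; only the one that executed the
      // last iteration copies its private value to the original variable.
      if (sawLast) {
        result.value = priv;
        result.writers += 1;
        result.team = team;
        result.thread = tid;
      }
    }
  }
  return result;
}
```

In this model each thread can compute `sawLast` itself because it runs the inner loop directly. The difficulty described above is that in the real SPMD lowering the enclosing `distribute` code has no such flag available once the `parallel` region has ended.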