[Openmp-commits] [PATCH] D50158: [OpenMP] Add placeholder functions for the depend and nowait depend clauses for target data directives.
Alexandre Eichenberger via Phabricator via Openmp-commits
openmp-commits at lists.llvm.org
Thu Aug 2 11:10:53 PDT 2018
AlexEichenberger added a comment.
**Implementation:**
First, there missing sync for the _tgt_target_data_begin_depend, __tgt_target_data_begin_depend_nowait, __tgt_target_data_update_depend, __tgt_target_data_update_depend_nowait,
Look at the end versions of the call for a possible implementation.
Second, it would be better IMO to use __kmpc_omp_wait_deps instead of the broader task wait.
**Larger picture:**
We negotiated these calls with Alexey: we could create target tasks for these, but Alexey estimated that creating target tasks for entries like begin/end/update, which have no target code, was significantly more complicated. And since it's not a big deal to do it in the runtime, that is why we agreed on having the async versions of these as entry points. If Alexey want to revisit his judgement, I have no issue with that. Its a lot of busy work for something that can easily be replaced in the runtime, but fine by me either ways.
**Bigger pictures:**
In https://reviews.llvm.org/D50158#1185943, @Hahnfeld wrote:
> When truly implementing asynchronous offloading libomptarget needs to tightly integrate into libomp's task scheduling. This will require a larger set of functions and I don't think we should change the interface just because we want to improve the implementation.
>
> Anyway it's libOMPtarget, I don't really see the point. The library isn't agnostic of the programming model, it obeys to `OMP_` environment variables - which btw. are also queried from libomp!
Agreed, and what is needed here is a way to launch a target task, have libOMP filter dependences that can be satisfied on the device and the one that cannot. The ones that can be satisfied on the device, we need an token that will in effect describe an event/stream combination for NVIDIA, and they will be considered as satisfied from the host's perspective. The ones that are not satisfied on the device, the hose will have to wait for these as usual. Then once all the host dependences are satisfied, the actual task can be launched with the device dependence (list of token to event/stream) being inserted by libomptarget.
The mechanism in libOMP is fairly unobtrusive, in that the dependence graph nodes need to be augmented by one device descriptor (host or device id) and one dependence descriptor (which is device dependent). When scanning a target task, the compiler has to create a task with extra data to maintain up to one pointer per dependence. So the filter add the dependences that can be satisfied by the device in that extra storage associated with the task. When the target task is ready to launch (i.e. all the host dependences are satisfied), libomptarget will get this list of filtered dependences and act accordingly.
Repository:
rOMP OpenMP
https://reviews.llvm.org/D50158
More information about the Openmp-commits
mailing list