[cfe-dev] [Openmp-dev] Discussion about OpenMP 5.0 declare mapper runtime interface

Mon Sep 30 15:34:44 PDT 2019

On 09/30, Finkel, Hal J. wrote:
> 
> On 9/30/19 3:49 PM, Lingda Li via Openmp-dev wrote:
> 
>     Hi,
> 
>     I would like to bring your attention to the choice of 2 proposals for the
>     declare mapper runtime interface:
> 
>     1. The current design which creates new runtime functions for declare
>     mappers. For example, right now we have `__tgt_target_teams(...)` which
>     corresponds to the runtime interface for `omp target teams`. Now we add
>     `__tgt_target_teams_mapper(..., void **mappers)` to replace it.
> 
>     As a result, the old interfaces will be deprecated, but they need to be
>     kept there for backward compatibility. I think this scheme is clear and has
>     no hidden problems. The downside is that it will create more OpenMP runtime
>     interfaces. The patches for this scheme can be found at
>     https://reviews.llvm.org/D67833 and https://reviews.llvm.org/D68100.
> 
>     2. Introduce a function `__tgt_push_mappers`, which should be called before
>     every target function call (e.g., `__tgt_target_teams`) to pass the mapper
>     argument for that function. The call to `__tgt_push_mappers` is implicitly
>     bound to the actual target call that follows it.
> 
>     This scheme will introduce fewer runtime interfaces. Its problem is that the
>     implementation is not straightforward and requires extra precautions.
>     For example, each OpenMP task should have a separate mapper storage, to
>     prevent a target region from reading the mapper written by another task.
> 
>     Your opinion about which one is better will be greatly appreciated.
> 
> 
> I prefer option 1. I don't like the idea of introducing extra mandatory state
> in the library simply to avoid adding a new runtime call with additional
> parameters. My experience is that this kind of extra state is more likely to be
> a source of bugs than the extra parameters of a new function, plus the extra
> function-call overhead and memory traffic is likely more expensive than the
> extra parameter passing. We need to extend the API either way.
> 
> Furthermore, I suspect that breaking of the runtime call into several stateful
> calls will make the calls more difficult to analyze within the optimizer.
> Johannes, do you agree?
> 
> I realize that we've done this kind of thing in the past - and similar things
> exist in the CUDA runtime interface, etc. - but my experience with this kind of
> API extension has been largely negative.

I totally agree, option 1 is preferable.

For other situations it is already clear that multiple calls make
optimizations harder and that we will introduce "meta" calls to
encapsulate intent and not implementation details of the runtime. (The
next on my list is task_alloc(...) and the subsequent task(...), but
TRegions were another example.)
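
To make the contrast concrete, here is a minimal sketch of the two call
shapes. All names and parameter lists are hypothetical stand-ins, not the
real libomptarget interface, and `thread_local` is only an assumed
approximation of the per-task mapper storage that option 2 would need.

```cpp
#include <cstdint>

// Option-2 style (hypothetical): one call stashes state, the next consumes it.
// thread_local stands in for per-task storage; this is an assumption.
thread_local void **pending_mappers = nullptr;

void sketch_push_mappers(void **mappers) { pending_mappers = mappers; }

// Consumes and clears the pending state; returns true if mappers were attached.
bool sketch_target_stateful(std::int32_t arg_num, void **args) {
  (void)arg_num;
  (void)args;
  void **mappers = pending_mappers;
  pending_mappers = nullptr; // reset so a later call cannot see stale mappers
  return mappers != nullptr;
}

// Option-1 style (hypothetical): everything is visible at the call site.
bool sketch_target_single(std::int32_t arg_num, void **args, void **mappers) {
  (void)arg_num;
  (void)args;
  return mappers != nullptr;
}
```

In the stateful variant the result of the target call depends on whether a
push happened earlier, which is exactly the hidden coupling an optimizer
would have to reason about. In the single-call variant the dependence is an
ordinary argument.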

The argument that there will be "more runtime interface" is, especially
in this case, not a strong one given that we can define old functions in
terms of the new ones with basically no maintenance overhead:

__tgt_target_teams(...) = __tgt_target_teams_mapper(..., nullptr);
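
Spelled out as code, the forwarding could look roughly like the sketch
below. The function names and the shortened parameter list are assumptions
made for illustration; the real entry points take more arguments. The
stand-in body only counts non-null mappers so that the behaviour is
observable.

```cpp
#include <cstdint>

// Hypothetical stand-in for the new, mapper-aware entry point.
// The parameter list is simplified and is not the real runtime signature.
// Returns the number of non-null mappers it was given.
int sketch_target_teams_mapper(std::int64_t device_id, void *host_ptr,
                               std::int32_t arg_num, void **args,
                               void **mappers) {
  (void)device_id;
  (void)host_ptr;
  (void)args;
  int used = 0;
  if (mappers != nullptr) {
    for (std::int32_t i = 0; i < arg_num; ++i) {
      if (mappers[i] != nullptr) {
        ++used;
      }
    }
  }
  return used;
}

// Old entry point, kept for backward compatibility: it simply forwards to
// the new one with no mappers.
int sketch_target_teams(std::int64_t device_id, void *host_ptr,
                        std::int32_t arg_num, void **args) {
  return sketch_target_teams_mapper(device_id, host_ptr, arg_num, args,
                                    nullptr);
}
```

With this shape the old function has no logic of its own, so only the new
entry point needs to be maintained.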

Cheers,
  Johannes
