[Openmp-commits] [PATCH] D92197: [OpenMP] Avoid internal calls to external compiler interface (kmpc)

Johannes Doerfert via Phabricator via Openmp-commits openmp-commits at lists.llvm.org
Tue Dec 1 23:34:14 PST 2020


jdoerfert added a comment.

In D92197#2426493 <https://reviews.llvm.org/D92197#2426493>, @protze.joachim wrote:

> In D92197#2425687 <https://reviews.llvm.org/D92197#2425687>, @jdoerfert wrote:
>
>> I'm not convinced. I didn't go over everything but I think we should discuss the goal and choices here first.
>>
>> I checked the `OmptReturnAddressGuard` and it looks like it is a noop to store a return address if one is set already, right? (assuming same thread)
>> Could we unconditionally execute `OMPT_STORE_RETURN_ADDRESS(gtid);` instead to avoid the `_aux` stuff?
>
> The store only if not yet stored approach is a hack necessary to work around the issue that the compiler facing interface of the runtime is also called internally and therefore we cannot determine the caller.
> This approach fails if the runtime misses to reset the stored value, when passing control back to the application:
>
> - when returning to the application (solved by the guards)
> - when calling into the application to execute outlined code
>
> Especially for barriers, we need the return address in different runtime functions and also after potentially calling into the application many times.
>
> With a clear separation of the external and internal interface, we can unconditionally push the return address onto the threadlocal stack when entering the runtime trough the external interface.
> The bigger deal are the frame pointers. There the assertions effectively only apply with this separated, external use of the interface.

Hm,.. I doubt I understand the problem fully. Could you present it in our wednesday meeting at some point (in ~7.5h or a week)? 
FWIW, one reason I'm hesitant is that this duplication has to be continued as new APIs come in.

>> (Partially related, @AndreyChurbanov  I'd argue the branch in `OmptReturnAddressGuard` is `UNLIKELY` as well, basically all that check `ompt_enabled.enabled` are?)
>
> I agree, that we can put UNLIKELY there. When we tuned the functions for _if0 tasks with templates some years ago, I did some epcc benchmark experiments, which showed that putting UNLIKELY for every `ompt_enabled.enabled` does not always improve performance, but sometimes actually degrades performance.

If it does on average I'd do it. In the long run the compiler should improve in its use of hints; at least I hope ;)



================
Comment at: openmp/runtime/src/kmp_cancel.cpp:30
+kmp_int32 __forceinline __kmp_cancel_impl(ident_t *loc_ref, kmp_int32 gtid,
+                                          kmp_int32 cncl_kind) {
   kmp_info_t *this_thr = __kmp_threads[gtid];
----------------
protze.joachim wrote:
> jdoerfert wrote:
> > No `__forceinline` please, also elsewhere.
> There are already many `__forceinline` in this file. Some dating back to Jim Cownies initial commit.
This is a MSCV extension, at the very least we should use something like described in the wikipedia article: https://en.wikipedia.org/wiki/Inline_function#Nonstandard_extensions

Preferably, we would not need to do any of it though, arguably the compiler should figure this out. What we should do instead is use (un)likely and hot/cold annotations, IMHO. [Providing information is good, assuming you know better means you should work on the inliner heuristic ;)]


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D92197/new/

https://reviews.llvm.org/D92197



More information about the Openmp-commits mailing list