[Openmp-commits] [PATCH] D92197: [OpenMP] Avoid internal calls to external compiler interface (kmpc)

Joachim Protze via Phabricator via Openmp-commits openmp-commits at lists.llvm.org
Tue Dec 1 12:54:21 PST 2020


protze.joachim added a comment.

In D92197#2425687 <https://reviews.llvm.org/D92197#2425687>, @jdoerfert wrote:

> I'm not convinced. I didn't go over everything but I think we should discuss the goal and choices here first.
>
> I checked the `OmptReturnAddressGuard` and it looks like it is a noop to store a return address if one is set already, right? (assuming same thread)
> Could we unconditionally execute `OMPT_STORE_RETURN_ADDRESS(gtid);` instead to avoid the `_aux` stuff?

The store only if not yet stored approach is a hack necessary to work around the issue that the compiler facing interface of the runtime is also called internally and therefore we cannot determine the caller.
This approach fails if the runtime misses to reset the stored value, when passing control back to the application:

- when returning to the application (solved by the guards)
- when calling into the application to execute outlined code

Especially for barriers, we need the return address in different runtime functions and also after potentially calling into the application many times.

With a clear separation of the external and internal interface, we can unconditionally push the return address onto the threadlocal stack when entering the runtime trough the external interface.
The bigger deal are the frame pointers. There the assertions effectively only apply with this separated, external use of the interface.

> (Partially related, @AndreyChurbanov  I'd argue the branch in `OmptReturnAddressGuard` is `UNLIKELY` as well, basically all that check `ompt_enabled.enabled` are?)

I agree, that we can put UNLIKELY there. When we tuned the functions for _if0 tasks with templates some years ago, I did some epcc benchmark experiments, which showed that putting UNLIKELY for every `ompt_enabled.enabled` does not always improve performance, but sometimes actually degrades performance.



================
Comment at: openmp/runtime/src/kmp_cancel.cpp:30
+kmp_int32 __forceinline __kmp_cancel_impl(ident_t *loc_ref, kmp_int32 gtid,
+                                          kmp_int32 cncl_kind) {
   kmp_info_t *this_thr = __kmp_threads[gtid];
----------------
jdoerfert wrote:
> No `__forceinline` please, also elsewhere.
There are already many `__forceinline` in this file. Some dating back to Jim Cownies initial commit.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D92197/new/

https://reviews.llvm.org/D92197



More information about the Openmp-commits mailing list