[Openmp-commits] [PATCH] D50522: [OpenMP][libomptarget] Bringing up to spec with respect to OMP_TARGET_OFFLOAD env var

Jonas Hahnfeld via Phabricator via Openmp-commits openmp-commits at lists.llvm.org
Fri Aug 10 09:13:45 PDT 2018


Hahnfeld added a comment.

In https://reviews.llvm.org/D50522#1195317, @AlexEichenberger wrote:

> I disagree. As it is currently written, omp_get_num_devices() can also grow over time, so you can also have it return zero, only to increase later to a larger number. What does that do to DEFAULT?


Ok, fair point. I think we need to decide at the first entry from user code: if one construct fell back to the host, all following constructs should too, shouldn't they?

> I believe the current policy is ok; if any device fails for any reason on first use, it becomes disabled. Anyone who wants to rely on devices being there should use the MANDATORY policy.

Huh, but there is only one global `TargetOffloadPolicy`. So how can we disable just a single device?

In https://reviews.llvm.org/D50522#1194903, @Hahnfeld wrote:

> Suppose we have 2 devices plugged into the system, and the first one cannot be used (for whatever reason: hardware failure, exclusive configuration and somebody else is running, etc.).
>  Now a clever application sees the two (because `omp_get_num_devices()` returns 2) and does:
>
>   #pragma omp parallel num_threads(omp_get_num_devices())
>   {
>     #pragma omp target device(omp_get_thread_num())
>     { }
>   }
>
>
> I think the runtime behaviour with this patch depends on the execution order (and exposes a race condition in `handle_target_outcome` on `TargetOffloadPolicy`; let's ignore that for now):
>
> - If `target device(0)` executes first, libomptarget will notice the error and silently disable offloading. All `target` regions will execute on the host.
> - If however `target device(1)` executes first and returns successfully, libomptarget will raise `OMP_TARGET_OFFLOAD` to `MANDATORY` and will abort execution when catching the error of `target device(0)`. I don't think that makes much sense. IMO the runtime should detect two "visible" -> "available" devices and abort execution in all cases.


Did you consider this example?
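
To make the order dependence concrete, here is a minimal C++ model of the behaviour described in the two bullets above. It is only a sketch and not the libomptarget code: `Policy`, `Outcome`, `ModelRuntime` and `run_in_order` are hypothetical names made up for illustration, and the model is sequential, so it ignores the race on `TargetOffloadPolicy`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of the policy handling discussed above (not libomptarget).
enum class Policy { Default, Disabled, Mandatory };
enum class Outcome { Offloaded, HostFallback, Abort };

struct ModelRuntime {
  Policy policy = Policy::Default; // one global policy, as in the patch
  std::vector<bool> device_ok;     // device_ok[i]: can device i actually be used?

  // Models one `target device(device)` region.
  Outcome target(int device) {
    assert(device >= 0 && static_cast<std::size_t>(device) < device_ok.size());
    if (policy == Policy::Disabled)
      return Outcome::HostFallback; // offloading was silently disabled earlier
    if (device_ok[device]) {
      if (policy == Policy::Default)
        policy = Policy::Mandatory; // first success raises DEFAULT to MANDATORY
      return Outcome::Offloaded;
    }
    if (policy == Policy::Default) {
      policy = Policy::Disabled;    // first failure disables offloading globally
      return Outcome::HostFallback;
    }
    return Outcome::Abort;          // failure under MANDATORY
  }
};

// Runs the target regions in the given order and records each outcome.
std::vector<Outcome> run_in_order(const std::vector<bool> &device_ok,
                                  const std::vector<int> &order) {
  ModelRuntime rt;
  rt.device_ok = device_ok;
  std::vector<Outcome> results;
  for (int d : order)
    results.push_back(rt.target(d));
  return results;
}
```

With device 0 broken and device 1 working, `run_in_order({false, true}, {0, 1})` gives a host fallback for both regions, while `run_in_order({false, true}, {1, 0})` gives one offloaded region followed by an abort. Same program, same hardware, different result depending on which thread gets there first.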


Repository:
  rOMP OpenMP

https://reviews.llvm.org/D50522




