[Openmp-dev] Libomptarget fatal error 1: '#pragma omp requires unified_shared_memory' not used consistently!

Itaru Kitayama via Openmp-dev openmp-dev at lists.llvm.org
Wed Mar 10 12:29:58 PST 2021


All,

I checked out the OpenMP Examples repo and tried the requires directive
on an A100, which supports unified shared memory (USM); it failed.

Is this to be addressed soon?

On Tue, Mar 2, 2021 at 2:41 AM Joachim Protze <protze.joachim at gmail.com> wrote:
>
> Hi all,
>
> an even simpler example segfaults when the requires directive is present:
>
> #include <iostream>
> #include <omp.h>
> #include <stdio.h>
>
> #pragma omp requires unified_shared_memory
> #define N 1024
>
> int main() {
>   int a[N];
>   printf("a=%p\n", a);
> #pragma omp target map(tofrom : a[0:N])
>   {
>     printf("a=%p\n", a);
>     for (int i = 0; i < 1024; i++) {
>       a[i] = 123;
>     }
>   }
>   printf("a[0]=%i, a[%i]=%i\n", a[0], N/2, a[N/2]);
> }
>
> The code runs successfully when the requires directive is removed, because
> the mapping of `a` is explicitly specified.
>
> For this code to run successfully, would it be necessary to allocate `a`
> specially as CUDA managed memory? I don't see any special treatment of
> `a` in the LLVM IR. As I understand the OpenMP spec, the requires directive
> should lead to a compile error if clang fails to generate such code.
>
> The requires example from the OpenMP Examples also fails with the same
> runtime error:
>
> https://github.com/OpenMP/Examples/blob/main/sources/Example_requires.1.cpp
>
> - Joachim
>
> On 28.02.21 at 11:12, Itaru Kitayama via Openmp-dev wrote:
> > This is the code:
> >
> > #include <iostream>
> > #include <omp.h>
> >
> > #pragma omp requires unified_shared_memory
> > #define N 1024
> >
> > int main() {
> >   int a[N] = {0};
> >   int *device_data =  new int[N];
> > #pragma omp target map(tofrom : device_data[0:N])
> >   {
> >     device_data = &a[0];
> >     for (int i = 0; i < 1024; i++) {
> >       device_data[i] = 123;
> >     }
> >   }
> >   std::cout << a[0] << std::endl;
> > }
> >
> > On Sun, Feb 28, 2021 at 1:34 PM Johannes Doerfert
> > <johannesdoerfert at gmail.com> wrote:
> >>
> >> You have an illegal memory access, some memory is not properly
> >> mapped.
> >>
> >>
> >> On 2/27/21 7:47 PM, Itaru Kitayama wrote:
> >>> Removed the internal function, but I get:
> >>>
> >>> CUDA device 0 info: Device supports up to 65536 CUDA blocks and 1024
> >>> threads with a warp size of 32
> >>> CUDA device 0 info: Launching kernel
> >>> __omp_offloading_34_8009dd23_main_l12 with 1 blocks and 33 threads in
> >>> Generic mode
> >>> CUDA error: Error when synchronizing stream. stream =
> >>> 0x0000000001d22ae0, async info ptr = 0x00007ffe73ea2728
> >>> CUDA error: an illegal memory access was encountered
> >>> Libomptarget error: Failed to synchronize device.
> >>> Libomptarget error: Call to targetDataEnd failed, abort target.
> >>> Libomptarget error: Failed to process data after launching the kernel.
> >>> Libomptarget error: Source location information not present. Compile
> >>> with -g or -gline-tables-only.
> >>> Libomptarget fatal error 1: failure of target construct while
> >>> offloading is mandatory
> >>> /var/spool/parastation/jobs/8941317: line 23: 20812 Aborted
> >>>       (core dumped) ./a.out
> >>>
> >>> On Sun, Feb 28, 2021 at 10:35 AM Alexey Bataev <a.bataev at hotmail.com> wrote:
> >>>> Do not call __tgt_register_requires directly. It is an internal function called by a global constructor, and its argument value depends on #pragma omp requires. Use just the pragma.
> >>>>
> >>>> Best regards,
> >>>> Alexey Bataev
> >>>>
> >>>>> On Feb 27, 2021, at 20:28, Itaru Kitayama via Openmp-dev <openmp-dev at lists.llvm.org> wrote:
> >>>>>
> >>>>> I'm trying to build a test C++ code that uses part of
> >>>>> unified_shared_memory/shared_update.c
> >>>>>
> >>>>>> On Sun, Feb 28, 2021 at 10:25 AM Johannes Doerfert
> >>>>>> <johannesdoerfert at gmail.com> wrote:
> >>>>>>
> >>>>>> I don't see this test, nor do I understand what you are trying to say.
> >>>>>> Is the test failing? If so, which test is this?
> >>>>>>
> >>>>>> ~ Johannes
> >>>>>>
> >>>>>>
> >>>>>>> On 2/27/21 7:17 PM, Itaru Kitayama via Openmp-dev wrote:
> >>>>>>> The below C++ code builds, but the executable fails at runtime.
> >>>>>>> (It is taken from the C code under the libomptarget subdir's test directory)
> >>>>>>>
> >>>>>>> #include <omp.h>
> >>>>>>>
> >>>>>>> #pragma omp requires unified_shared_memory
> >>>>>>> #define N 1024
> >>>>>>> extern "C" void __tgt_register_requires(int64_t);
> >>>>>>>
> >>>>>>> int main() {
> >>>>>>>
> >>>>>>>    int a[N] = {0};
> >>>>>>>    int b[N] = {0};
> >>>>>>>    int *device_data;
> >>>>>>>    __tgt_register_requires(1);
> >>>>>>> #pragma omp target map(tofrom : device_data)
> >>>>>>>    {
> >>>>>>>      device_data = &a[0];
> >>>>>>>      for (int i = 0; i < 1024; i++) {
> >>>>>>>        a[i] += 1;
> >>>>>>>      }
> >>>>>>>    }
> >>>>>>> }
> >>>>>>> _______________________________________________
> >>>>>>> Openmp-dev mailing list
> >>>>>>> Openmp-dev at lists.llvm.org
> >>>>>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/openmp-dev
> >
>

