[Openmp-dev] Libomptarget fatal error 1: '#pragma omp requires unified_shared_memory' not used consistently!

Mon Mar 1 14:35:44 PST 2021

I'm on JURECA, where some nodes have A100 GPUs attached.

On Tue, Mar 2, 2021 at 7:34 Itaru Kitayama <itaru.kitayama at gmail.com> wrote:

> Hi all,
> In the meantime, what should I do?
>
> On Tue, Mar 2, 2021 at 3:23 Johannes Doerfert <johannesdoerfert at gmail.com>
> wrote:
>
>> I think that is it. I heard of problems with our USM before.
>> We need to use the managed allocators if USM is active, they are
>> about to be upstreamed (I hope).
>>
>>
>> On 3/1/21 12:15 PM, Alexey.Bataev wrote:
>> > It looks like this example is for explicit USM, and I assume that if you
>> > allocate the memory for `a` in managed memory explicitly, the OpenMP
>> > example should also work.
>> >
>> > There are other USM modes, though, where the memory is shared implicitly
>> > between the host and the devices. It looks like the LLVM implementation
>> > currently relies on the system allocator described here, where implicit
>> > USM is supported:
>> >
>> > https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#um-system-allocator
>> >
>> > -------------
>> > Best regards,
>> > Alexey Bataev
>> >
>> > On 3/1/2021 1:04 PM, Joachim Protze wrote:
>> >> Are the kernel/hardware requirements LLVM-specific?
>> >>
>> >> I can compile and execute the add_grid.cu example successfully:
>> >> https://developer.nvidia.com/blog/unified-memory-cuda-beginners/
>> >>
>> >> So I would expect that an OpenMP program should also run successfully.
>> >>
>> >> - Joachim
>> >>
>> >>
>> >> On 01.03.21 at 18:49, Alexey.Bataev wrote:
>> >>> Hi, are you sure that your system supports Unified Shared Memory? As far
>> >>> as I know, it requires a special Linux kernel, and the hardware must
>> >>> support it. If your system does not support it, the code will certainly
>> >>> crash at runtime.
>> >>>
>> >>> In this mode, IIRC, we just ignore map clauses since the accelerator
>> >>> devices can access the host memory directly without the need for
>> >>> allocating the device-specific memory.
>> >>>
>> >>>
>> >>> -------------
>> >>> Best regards,
>> >>> Alexey Bataev
>> >>>
>> >>> On 3/1/2021 12:41 PM, Joachim Protze wrote:
>> >>>> Hi all,
>> >>>>
>> >>>> An even simpler example segfaults when the requires directive is present:
>> >>>>
>> >>>> #include <iostream>
>> >>>> #include <omp.h>
>> >>>> #include <stdio.h>
>> >>>>
>> >>>> #pragma omp requires unified_shared_memory
>> >>>> #define N 1024
>> >>>>
>> >>>> int main() {
>> >>>>    int a[N];
>> >>>>    printf("a=%p\n", a);
>> >>>> #pragma omp target map(tofrom : a[0:N])
>> >>>>    {
>> >>>>      printf("a=%p\n", a);
>> >>>>      for (int i = 0; i < 1024; i++) {
>> >>>>        a[i] = 123;
>> >>>>      }
>> >>>>    }
>> >>>>    printf("a[0]=%i, a[%i]=%i\n", a[0], N/2, a[N/2]);
>> >>>> }
>> >>>>
>> >>>> The code runs successfully when the requires directive is removed,
>> >>>> because the mapping of `a` is explicitly specified.
>> >>>>
>> >>>> For this code to run successfully, would it be necessary to allocate
>> >>>> `a` specially as CUDA managed memory? I don't see any special treatment
>> >>>> of `a` in the LLVM IR. As I understand the OpenMP spec, the requires
>> >>>> directive should lead to a compile error if clang fails to generate
>> >>>> such code.
>> >>>>
>> >>>> The requires example from the OpenMP Examples also fails with the same
>> >>>> runtime error:
>> >>>>
>> >>>> https://github.com/OpenMP/Examples/blob/main/sources/Example_requires.1.cpp
>> >>>>
>> >>>> - Joachim
>> >>>>
>> >>>> On 28.02.21 at 11:12, Itaru Kitayama via Openmp-dev wrote:
>> >>>>> This is the code:
>> >>>>>
>> >>>>> #include <iostream>
>> >>>>> #include <omp.h>
>> >>>>>
>> >>>>> #pragma omp requires unified_shared_memory
>> >>>>> #define N 1024
>> >>>>>
>> >>>>> int main() {
>> >>>>>    int a[N] = {0};
>> >>>>>    int *device_data =  new int[N];
>> >>>>> #pragma omp target map(tofrom : device_data[0:N])
>> >>>>>    {
>> >>>>>      device_data = &a[0];
>> >>>>>      for (int i = 0; i < 1024; i++) {
>> >>>>>        device_data[i] = 123;
>> >>>>>      }
>> >>>>>    }
>> >>>>>    std::cout << a[0] << std::endl;
>> >>>>> }
>> >>>>>
>> >>>>> On Sun, Feb 28, 2021 at 1:34 PM Johannes Doerfert
>> >>>>> <johannesdoerfert at gmail.com> wrote:
>> >>>>>> You have an illegal memory access, some memory is not properly
>> >>>>>> mapped.
>> >>>>>>
>> >>>>>>
>> >>>>>> On 2/27/21 7:47 PM, Itaru Kitayama wrote:
>> >>>>>>> Removed the internal function, but I get:
>> >>>>>>>
>> >>>>>>> CUDA device 0 info: Device supports up to 65536 CUDA blocks and 1024 threads with a warp size of 32
>> >>>>>>> CUDA device 0 info: Launching kernel __omp_offloading_34_8009dd23_main_l12 with 1 blocks and 33 threads in Generic mode
>> >>>>>>> CUDA error: Error when synchronizing stream. stream = 0x0000000001d22ae0, async info ptr = 0x00007ffe73ea2728
>> >>>>>>> CUDA error: an illegal memory access was encountered
>> >>>>>>> Libomptarget error: Failed to synchronize device.
>> >>>>>>> Libomptarget error: Call to targetDataEnd failed, abort target.
>> >>>>>>> Libomptarget error: Failed to process data after launching the kernel.
>> >>>>>>> Libomptarget error: Source location information not present. Compile with -g or -gline-tables-only.
>> >>>>>>> Libomptarget fatal error 1: failure of target construct while offloading is mandatory
>> >>>>>>> /var/spool/parastation/jobs/8941317: line 23: 20812 Aborted (core dumped) ./a.out
>> >>>>>>>
>> >>>>>>> On Sun, Feb 28, 2021 at 10:35 AM Alexey Bataev <a.bataev at hotmail.com> wrote:
>> >>>>>>>> Do not call __tgt_register_requires directly. It is an internal
>> >>>>>>>> function called by a global constructor, and its argument value
>> >>>>>>>> depends on #pragma omp requires. Just use the pragma.
>> >>>>>>>>
>> >>>>>>>> Best regards,
>> >>>>>>>> Alexey Bataev
>> >>>>>>>>
>> >>>>>>>>> On Feb 27, 2021, at 20:28, Itaru Kitayama via Openmp-dev <openmp-dev at lists.llvm.org> wrote:
>> >>>>>>>>>
>> >>>>>>>>> I'm trying to build a test C++ code that uses part of
>> >>>>>>>>> unified_shared_memory/shared_update.c
>> >>>>>>>>>
>> >>>>>>>>>> On Sun, Feb 28, 2021 at 10:25 AM Johannes Doerfert
>> >>>>>>>>>> <johannesdoerfert at gmail.com> wrote:
>> >>>>>>>>>>
>> >>>>>>>>>> I don't see this test, nor do I understand what you are trying to say.
>> >>>>>>>>>> Is the test failing? If so, which test is this?
>> >>>>>>>>>>
>> >>>>>>>>>> ~ Johannes
>> >>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>>> On 2/27/21 7:17 PM, Itaru Kitayama via Openmp-dev wrote:
>> >>>>>>>>>>> The C++ code below builds, but the executable fails at runtime.
>> >>>>>>>>>>> (It is taken from the C code in the test directory under the libomptarget subdirectory.)
>> >>>>>>>>>>>
>> >>>>>>>>>>> #include <omp.h>
>> >>>>>>>>>>>
>> >>>>>>>>>>> #pragma omp requires unified_shared_memory
>> >>>>>>>>>>> #define N 1024
>> >>>>>>>>>>> extern "C" void __tgt_register_requires(int64_t);
>> >>>>>>>>>>>
>> >>>>>>>>>>> int main() {
>> >>>>>>>>>>>
>> >>>>>>>>>>>     int a[N] = {0};
>> >>>>>>>>>>>     int b[N] = {0};
>> >>>>>>>>>>>     int *device_data;
>> >>>>>>>>>>>     __tgt_register_requires(1);
>> >>>>>>>>>>> #pragma omp target map(tofrom : device_data)
>> >>>>>>>>>>>     {
>> >>>>>>>>>>>       device_data = &a[0];
>> >>>>>>>>>>>       for (int i = 0; i < 1024; i++) {
>> >>>>>>>>>>>         a[i] += 1;
>> >>>>>>>>>>>       }
>> >>>>>>>>>>>     }
>> >>>>>>>>>>> }
>> >>>>>>>>>>> _______________________________________________
>> >>>>>>>>>>> Openmp-dev mailing list
>> >>>>>>>>>>> Openmp-dev at lists.llvm.org
>> >>>>>>>>>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/openmp-dev
>> >
>>
>
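
As a stopgap for the "what should I do?" question: Joachim's observation
above, that the reproducer runs once the requires directive is removed
because `a` is mapped explicitly, can be sketched roughly as follows. This is
only a minimal sketch, not a fix for USM in libomptarget. The helper name
fill_with_explicit_map is made up for illustration. It assumes the
`#pragma omp requires unified_shared_memory` line is dropped from the
translation unit, and it writes only through the pointer named in the map
clause, so the kernel never touches host memory that was not mapped.

```cpp
#include <cassert>

// Hypothetical helper (not from the thread): fill data[0..n) with `value`
// inside a target region. There is deliberately no
// `#pragma omp requires unified_shared_memory` in this file, so the map
// clause below copies the array to the device and back.
void fill_with_explicit_map(int *data, int n, int value) {
  assert(data != nullptr && n >= 0);
#pragma omp target map(tofrom : data[0:n])
  for (int i = 0; i < n; i++) {
    // Write through the mapped pointer only; do not redirect it to an
    // unmapped host array inside the region.
    data[i] = value;
  }
}
```

Called as fill_with_explicit_map(a, N, 123) on an `int a[N]`, this should
leave every element set to 123 after the region. When built without
-fopenmp the pragma is ignored and the loop simply runs on the host.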