[Openmp-dev] Target CUDA RTL --> CUDA error is: an illegal memory access was encountered

Itaru Kitayama via Openmp-dev openmp-dev at lists.llvm.org
Sun Jan 19 17:01:00 PST 2020


I have updated the code, as a dynamically allocated array is in practice,
much important:
#include <stdio.h>
#include <iostream>

class AbsBase {
public:
        virtual int f() = 0;
virtual void map_in() = 0;
virtual void map_out() = 0;
};

class Derived : public AbsBase {
private:
        int *arr = new int[100];
        //int arr[100];
public:
        int f() { return arr[0]; }
        void fillarray() {
                arr[0] = 1234;
        }
virtual void map_in() {
#pragma omp target enter data map(to:this[0:1], this->arr[0:1])
}
virtual void map_out() {
#pragma omp target exit data map(from:this[0:1], this->arr[0:1])
}
};

int main() {
        AbsBase *absBase = new Derived();
static_cast<Derived*>(absBase)->fillarray();
static_cast<Derived*>(absBase)->map_in();
#pragma omp target parallel for
        for (int i=0;i<10;i++) {
                Derived d1(*static_cast<Derived*>(absBase));
                printf("arr[0] is %d\n", d1.f());
        }
static_cast<Derived*>(absBase)->map_out();
}

Here are the relevant errors:
[...]
Libomptarget --> Looking up mapping(HstPtrBegin=0x0000010023511268,
Size=8)...
Libomptarget --> Creating new map entry: HstBase=0x0000010023511260,
HstBegin=0x0000010023511268, HstEnd=0x0000010023511270,
TgtBegin=0x0000110048600000
Libomptarget --> There are 8 bytes allocated at target address
0x0000110048600000 - is new
Libomptarget --> Looking up mapping(HstPtrBegin=0x0000010023511260,
Size=16)...
Libomptarget --> WARNING: Pointer is not mapped but section extends into
already mapped data
Libomptarget --> Explicit extension of mapping is not allowed.
Libomptarget --> Call to getOrAllocTgtPtr returned null pointer (device
failure or illegal mapping).
Libomptarget --> There are 16 bytes allocated at target address
0x0000000000000000 - is new
Libomptarget --> Moving 16 bytes (hst:0x0000010023511260) ->
(tgt:0x0000000000000000)
Target CUDA RTL --> Error when copying data from host to device. Pointers:
host = 0x0000010023511260, device = 0x0000000000000000, size = 16
Target CUDA RTL --> CUDA error is: invalid argument
Libomptarget --> Copying data to device failed.
Libomptarget fatal error 1: failure of target construct while offloading is
mandatory
[...]

On Sun, Jan 19, 2020 at 3:26 PM Itaru Kitayama <itaru.kitayama at gmail.com>
wrote:

> Tom,
>
> Below builds, but fails at run time:
>
> #include <stdio.h>
>
> class AbsBase {
> public:
>         virtual int f() = 0;
> virtual void map_in() = 0;
> virtual void map_out() = 0;
> };
>
> class Derived : public AbsBase {
> private:
>         int *arr = new int[100];
> public:
>         int f() { return arr[0]; }
>         void fillarray() {
>                 arr[0] = 1234;
>         }
> virtual void map_in() {
> #pragma omp target enter data map(to:this[0:1], this->arr[0:1])
> }
> virtual void map_out() {
> #pragma omp target exit data map(from:this[0:1], this->arr[0:1])
> }
> };
>
> int main() {
>         AbsBase *absBase = new Derived();
> static_cast<Derived*>(absBase)->fillarray();
> static_cast<Derived*>(absBase)->map_in();
>
> #pragma omp target parallel for
>         for (int i=0;i<10;i++) {
>                 Derived d1(*static_cast<Derived*>(absBase));
>                 printf("arr[0] is %d\n", d1.f());
>         }
> absBase->map_out();
> }
>
> What's wrong with my code?
>
> [kitayama1 at juronc11 pcp0151]$ ./a.out
> Libomptarget --> Loading RTLs...
> Libomptarget --> Loading library 'libomptarget.rtl.ppc64.so'...
> Libomptarget --> Successfully loaded library 'libomptarget.rtl.ppc64.so'!
> Libomptarget --> Registering RTL libomptarget.rtl.ppc64.so supporting 4
> devices!
> Libomptarget --> Loading library 'libomptarget.rtl.x86_64.so'...
> Libomptarget --> Unable to load library 'libomptarget.rtl.x86_64.so':
> libomptarget.rtl.x86_64.so: cannot open shared object file: No such file
> or directory!
> Libomptarget --> Loading library 'libomptarget.rtl.cuda.so'...
> Target CUDA RTL --> Start initializing CUDA
> Libomptarget --> Successfully loaded library 'libomptarget.rtl.cuda.so'!
> Libomptarget --> Registering RTL libomptarget.rtl.cuda.so supporting 1
> devices!
> Libomptarget --> Loading library 'libomptarget.rtl.aarch64.so'...
> Libomptarget --> Unable to load library 'libomptarget.rtl.aarch64.so':
> libomptarget.rtl.aarch64.so: cannot open shared object file: No such file
> or directory!
> Libomptarget --> RTLs loaded!
> Libomptarget --> Image 0x0000000010001a90 is NOT compatible with RTL
> libomptarget.rtl.ppc64.so!
> Libomptarget --> Image 0x0000000010001a90 is compatible with RTL
> libomptarget.rtl.cuda.so!
> Libomptarget --> RTL 0x0000010000a6d6d0 has index 0!
> Libomptarget --> Registering image 0x0000000010001a90 with RTL
> libomptarget.rtl.cuda.so!
> Libomptarget --> Done registering entries!
> Libomptarget --> Call to omp_get_num_devices returning 1
> Libomptarget --> Default TARGET OFFLOAD policy is now mandatory (devices
> were found)
> Libomptarget --> Entering data begin region for device -1 with 3 mappings
> Libomptarget --> Use default device id 0
> Libomptarget --> Checking whether device 0 is ready.
> Libomptarget --> Is the device 0 (local ID 0) initialized? 0
> Target CUDA RTL --> Init requires flags to 1
> Target CUDA RTL --> Getting device 0
> Target CUDA RTL --> Max CUDA blocks per grid 2147483647 exceeds the hard
> team limit 65536, capping at the hard limit
> Target CUDA RTL --> Using 1024 CUDA threads per block
> Target CUDA RTL --> Max number of CUDA blocks 65536, threads 1024 & warp
> size 32
> Target CUDA RTL --> Default number of teams set according to library's
> default 128
> Target CUDA RTL --> Default number of threads set according to library's
> default 128
> Libomptarget --> Device 0 is ready to use.
> Target CUDA RTL --> Load data from image 0x0000000010001a90
> Target CUDA RTL --> CUDA module successfully loaded!
> Target CUDA RTL --> Entry point 0x0000000000000000 maps to
> __omp_offloading_30_809bddc5_main_l31 (0x00001100003a99d0)
> Target CUDA RTL --> Sending global device environment data 4 bytes
> Libomptarget --> Entry  0: Base=0x0000010000ac1260,
> Begin=0x0000010000ac1268, Size=8, Type=0x20
> Libomptarget --> Entry  1: Base=0x0000010000ac1260,
> Begin=0x0000010000ac1260, Size=16, Type=0x1000000000001
> Libomptarget --> Entry  2: Base=0x0000010000ac1268,
> Begin=0x0000010000ac1280, Size=4, Type=0x1000000000011
> Libomptarget --> Looking up mapping(HstPtrBegin=0x0000010000ac1268,
> Size=8)...
> Libomptarget --> Creating new map entry: HstBase=0x0000010000ac1260,
> HstBegin=0x0000010000ac1268, HstEnd=0x0000010000ac1270,
> TgtBegin=0x0000110048600000
> Libomptarget --> There are 8 bytes allocated at target address
> 0x0000110048600000 - is new
> Libomptarget --> Looking up mapping(HstPtrBegin=0x0000010000ac1260,
> Size=16)...
> Libomptarget --> WARNING: Pointer is not mapped but section extends into
> already mapped data
> Libomptarget --> Explicit extension of mapping is not allowed.
> Libomptarget --> Call to getOrAllocTgtPtr returned null pointer (device
> failure or illegal mapping).
> Libomptarget --> There are 16 bytes allocated at target address
> 0x0000000000000000 - is new
> Libomptarget --> Moving 16 bytes (hst:0x0000010000ac1260) ->
> (tgt:0x0000000000000000)
> Target CUDA RTL --> Error when copying data from host to device. Pointers:
> host = 0x0000010000ac1260, device = 0x0000000000000000, size = 16
> Target CUDA RTL --> CUDA error is: invalid argument
> Libomptarget --> Copying data to device failed.
> Libomptarget fatal error 1: failure of target construct while offloading
> is mandatory
> Libomptarget --> Unloading target library!
> Libomptarget --> Image 0x0000000010001a90 is compatible with RTL
> 0x0000010000a6d6d0!
> Libomptarget --> Unregistered image 0x0000000010001a90 from RTL
> 0x0000010000a6d6d0!
> Libomptarget --> Done unregistering images!
> Libomptarget --> Removing translation table for descriptor
> 0x000000001007dff8
> Libomptarget --> Done unregistering library!
>
> On Sat, Jan 18, 2020 at 11:49 AM Scogland, Tom <scogland1 at llnl.gov> wrote:
>
>> That's the map of "*this" talking, which we made possible in a newer
>> standard than is implemented.  If you change it over to mapping either this
>> by an array section or a reference to *this it should work.  The general
>> pattern is to use that kind of mapping method to get the correct static
>> type at the point of the map.
>>
>> -Tom
>> ------------------------------
>> *From:* Itaru Kitayama <itaru.kitayama at gmail.com>
>> *Sent:* Friday, January 17, 2020 5:42 PM
>> *To:* Doerfert, Johannes <jdoerfert at anl.gov>
>> *Cc:* Alexey Bataev <a.bataev at hotmail.com>; openmp-dev <
>> openmp-dev at lists.llvm.org>; Scogland, Tom <scogland1 at llnl.gov>
>> *Subject:* Re: [Openmp-dev] Target CUDA RTL --> CUDA error is: an
>> illegal memory access was encountered
>>
>> Johannes,
>> Tested with Trunk on POWER8:
>>
>> test5.cpp:20:40: error: expected expression containing only member
>> accesses and/or array sections based on named variables
>>                 #pragma omp target enter data map(to:*this,
>> this->arr[0:100])
>>                                                      ^~~~~
>> test5.cpp:23:40: error: expected expression containing only member
>> accesses and/or array sections based on named variables
>>                 #pragma omp target enter data map(to:*this,
>> this->arr[0:100])
>>
>>
>> On Sat, Jan 18, 2020 at 4:50 AM Doerfert, Johannes <jdoerfert at anl.gov>
>> wrote:
>>
>> Check the attached email from Tom for a way to make this work. Let me
>> know what you think
>>
>>
>>
>> ---------------------------------------
>> Johannes Doerfert
>> Researcher
>>
>> Argonne National Laboratory
>> Lemont, IL 60439, USA
>>
>> jdoerfert at anl.gov
>>
>> ________________________________________
>> From: Alexey Bataev <a.bataev at hotmail.com>
>> Sent: Friday, January 17, 2020 03:33
>> To: Itaru Kitayama
>> Cc: Doerfert, Johannes; openmp-dev
>> Subject: Re: [Openmp-dev] Target CUDA RTL --> CUDA error is: an illegal
>> memory access was encountered
>>
>> The compiler does not work the way you want. It is not an issue of OpenMP
>> but of C++ itself.
>>
>> Best regards,
>> Alexey Bataev
>>
>> 16 янв. 2020 г., в 23:20, Itaru Kitayama <itaru.kitayama at gmail.com>
>> написал(а):
>>
>> 
>> #include <stdio.h>
>>
>> class AbsBase {
>> public:
>>         virtual int f() = 0;
>> };
>>
>> class Derived : public AbsBase {
>> private:
>>         //int *arr = new int[100];
>>         int arr[100];
>> public:
>>         int f() { return arr[0]; }
>>         void fillarray() {
>>                 arr[0] = 1234;
>>         }
>> };
>>
>> int main() {
>>         AbsBase *absBase = new Derived();
>> Derived *p = static_cast<Derived*>(absBase);
>>         p->fillarray();
>> #pragma omp target parallel for map(to: p[0:1])
>>         for (int i=0;i<10;i++) {
>>                 //Derived d1(*static_cast<Derived*>(absBase));
>>                 Derived d1(*p);
>>                 printf("arr[0] is %d\n", d1.f());
>>         }
>> }
>>
>> Above gives me what I wanted to see, but I would like to avoid doing a
>> cast on a pointer
>> to an abstract base class, but the instantiation is done by the derived
>> class.
>>
>>
>> On Fri, Jan 17, 2020 at 6:07 PM Alexey Bataev <a.bataev at hotmail.com
>> <mailto:a.bataev at hotmail.com>> wrote:
>> AbsBase is a pointer to a base class? Then not, compiler is not aware
>> that it is a pointer to the derived class and copies the array as an array
>> of base class, nit derived one.
>>
>> Best regards,
>> Alexey Bataev
>>
>> 16 янв. 2020 г., в 22:58, Itaru Kitayama <itaru.kitayama at gmail.com
>> <mailto:itaru.kitayama at gmail.com>> написал(а):
>>
>> 
>> If I change class Derived's private data to an array of 100 integers, it
>> executes, but
>> the first element of the array is reported as 0, instead of 1234.
>> Shouldn't bitwise copy
>> work when mapping the pointer to AbsBase class like an array of one?
>>
>> On Wed, Jan 15, 2020 at 2:32 PM Alexey Bataev <a.bataev at hotmail.com
>> <mailto:a.bataev at hotmail.com>> wrote:
>> And it should fail since you're mapping non trivially copyable type.
>>
>> Best regards,
>> Alexey Bataev
>>
>> 14 янв. 2020 г., в 19:14, Itaru Kitayama <itaru.kitayama at gmail.com
>> <mailto:itaru.kitayama at gmail.com>> написал(а):
>>
>> 
>> This piece of C++ program execution fails at run time.
>>
>> #include <stdio.h>
>>
>> class AbsBase {
>> public:
>>         virtual int f() = 0;
>> };
>>
>> class Derived : public AbsBase {
>> private:
>>         int *arr = new int[100];
>> public:
>>         int f() { return arr[0]; }
>>         void fillarray() {
>>                 arr[0] = 1234;
>>         }
>> };
>>
>> int main() {
>>         AbsBase *absBase = new Derived();
>>         static_cast<Derived*>(absBase)->fillarray();
>> #pragma omp target parallel for map(to: absBase[0:1])
>>         for (int i=0;i<10;i++) {
>>                 Derived d1(*static_cast<Derived*>(absBase));
>>                 printf("arr[0] is %d\n", d1.f());
>>         }
>> }
>>
>> On Sun, Jan 12, 2020 at 1:45 PM Doerfert, Johannes <jdoerfert at anl.gov
>> <mailto:jdoerfert at anl.gov>> wrote:
>> On 01/12, Itaru Kitayama wrote:
>> > Do you guys have a timeframe for that feature
>> > Implemented?
>>
>> I do not and I don't know anyone who will drive this right now.
>>
>> As mentioned before, you should be able to "move/copy-create" the
>> elements on the device in order to use virtual functions.
>>
>>
>> > On Sun, Jan 12, 2020 at 12:51 Doerfert, Johannes <jdoerfert at anl.gov
>> <mailto:jdoerfert at anl.gov>> wrote:
>> >
>> > > On 01/11, Alexey Bataev via Openmp-dev wrote:
>> > > > Virtual functions are not supported.
>> > >
>> > > Not yet ;).
>> > >
>> > > We'll get it with 5.1 so we might actually implement it soon. Till
>> then,
>> > > you have to create the object on the device you call the virtual
>> > > function.
>> > >
>>
>> --
>>
>> Johannes Doerfert
>> Researcher
>>
>> Argonne National Laboratory
>> Lemont, IL 60439, USA
>>
>> jdoerfert at anl.gov<mailto:jdoerfert at anl.gov>
>>
>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/openmp-dev/attachments/20200120/7c2c59ad/attachment-0001.html>


More information about the Openmp-dev mailing list