[Openmp-dev] Compiling OpenMP code with CUDA support using libc++

Thu Mar 5 12:36:29 PST 2020

2. Yes. I forgot that I had changed the static arrays to dynamic ones, so my
mapping was wrong. I changed it to array sections like x[:N], but the error
still occurs.
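
For reference, here is a minimal sketch of the array-section form of the
mapping. The helper name vector_add is made up for illustration, and the
to/from map types are my own choice; a plain map(x[:n], y[:n], z[:n]) as
suggested below defaults to tofrom and should also work.

```cpp
#include <cassert>

// Hypothetical helper (not from the original code): z[i] = x[i] + y[i].
// The array sections x[:n], y[:n], z[:n] tell the runtime how many elements
// each pointer refers to. A bare map(x, y, z) maps only the pointer values.
void vector_add(const float *x, const float *y, float *z, int n) {
  assert(x && y && z && n >= 0);
  #pragma omp target map(to: x[:n], y[:n]) map(from: z[:n])
  #pragma omp parallel for
  for (int i = 0; i < n; i++) {
    z[i] = x[i] + y[i];
  }
}
```

If the compiler is invoked without OpenMP enabled, the pragmas are ignored
and the loop simply runs on the host, which is handy for checking the result.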

I'll search more to see if it's due to CUDA 10.2. Thanks!

On Thu, Mar 5, 2020 at 2:31 PM, Alexey.Bataev <a.bataev at outlook.com> wrote:

> There are several issues with your code.
>
> 1. There are some issues with CUDA 10.2; clang does not fully support it
> yet. There were discussions about how to support it; you can search the
> list.
>
> 2. It seems to me the example is not correct. You need to map array
> sections, like map(x[:N], y[:N], z[:N]); otherwise the memory allocated for
> the arrays won't be mapped correctly, and only the pointers are mapped.
>
> -------------
> Best regards,
> Alexey Bataev
>
> On 05.03.2020 3:22 PM, G Zhao wrote:
>
> Yeah, the example code is below:
>
>   int N = 1<<20;
>
>   float *x = new float[N];
>   float *y = new float[N];
>
>   for (int i = 0; i < N; i++) {
>     x[i] = 1.0f;
>     y[i] = 2.0f;
>   }
>
>   float *z = new float[N];
>   int i;
>   #pragma omp target map(x, y, z)
>   #pragma omp parallel for
>   for (i=0; i < N; i++) {
>     z[i] = x[i] + y[i];
>   }
>
> I just grabbed a piece of code from
> https://www.openmp.org/wp-content/uploads/openmp-examples-4.5.0.pdf for
> testing. I also tested other examples in that document, but none worked.
> Initially, I was working on a piece of code from a legacy project.
>
> BTW, when I compiled the example code, I got some warnings:
>
> clang-11: warning: Unknown CUDA version 10.2. Assuming the latest
> supported version 10.1 [-Wunknown-cuda-version]
> clang-11: warning: Unknown CUDA version 10.2. Assuming the latest
> supported version 10.1 [-Wunknown-cuda-version]
> clang-11: warning: No library 'libomptarget-nvptx-sm_35.bc' found in the
> default clang lib directory or in LIBRARY_PATH. Expect degraded performance
> due to no inlining of runtime functions on target devices. [-Wopenmp-target]
>
> I am not sure if the error is caused by the newer CUDA version (I thought
> 10.2 should be compatible with 10.1).
>
> Thanks!
>
> Gang Zhao
>
> On Thu, Mar 5, 2020 at 12:59 PM, Alexey.Bataev <a.bataev at outlook.com> wrote:
>
>> Could you provide an example of how you map the data in the target region?
>>
>> -------------
>> Best regards,
>> Alexey Bataev
>>
>> On 05.03.2020 1:45 PM, G Zhao wrote:
>>
>> Thanks! I hadn't noticed this. The code is from a legacy project, and I
>> just checked all the target regions. They did use STL vectors. I replaced
>> all those vectors with arrays, and now it compiles successfully.
>> But when I run it, there is another error:
>>
>> Libomptarget fatal error 1: failure of target construct while offloading
>> is mandatory
>>
>> I tried it on a simple vector add example, and got the same error.
>>
>> Below is the debug information with LIBOMPTARGET_DEBUG=1:
>>
>> Libomptarget --> Loading RTLs...
>> Libomptarget --> Loading library 'libomptarget.rtl.ppc64.so'...
>> Libomptarget --> Unable to load library 'libomptarget.rtl.ppc64.so':
>> libomptarget.rtl.ppc64.so: cannot open shared object file: No such file
>> or directory!
>> Libomptarget --> Loading library 'libomptarget.rtl.x86_64.so'...
>> Libomptarget --> Successfully loaded library 'libomptarget.rtl.x86_64.so
>> '!
>> Libomptarget --> Registering RTL libomptarget.rtl.x86_64.so supporting 4
>> devices!
>> Libomptarget --> Loading library 'libomptarget.rtl.cuda.so'...
>> Target CUDA RTL --> Start initializing CUDA
>> Libomptarget --> Successfully loaded library 'libomptarget.rtl.cuda.so'!
>> Libomptarget --> Registering RTL libomptarget.rtl.cuda.so supporting 1
>> devices!
>> Libomptarget --> Loading library 'libomptarget.rtl.aarch64.so'...
>> Libomptarget --> Unable to load library 'libomptarget.rtl.aarch64.so':
>> libomptarget.rtl.aarch64.so: cannot open shared object file: No such
>> file or directory!
>> Libomptarget --> RTLs loaded!
>> Libomptarget --> Image 0x000000000041ad20 is NOT compatible with RTL
>> libomptarget.rtl.x86_64.so!
>> Libomptarget --> Image 0x000000000041ad20 is compatible with RTL
>> libomptarget.rtl.cuda.so!
>> Libomptarget --> RTL 0x00000000015b3c40 has index 0!
>> Libomptarget --> Registering image 0x000000000041ad20 with RTL
>> libomptarget.rtl.cuda.so!
>> Libomptarget --> Done registering entries!
>> Libomptarget --> Call to omp_get_num_devices returning 1
>> Libomptarget --> Default TARGET OFFLOAD policy is now mandatory (devices
>> were found)
>> Libomptarget --> Checking whether device 0 is ready.
>> Libomptarget --> Is the device 0 (local ID 0) initialized? 0
>> Target CUDA RTL --> Init requires flags to 1
>> Target CUDA RTL --> Getting device 0
>> Target CUDA RTL --> Max CUDA blocks per grid 2147483647 exceeds the hard
>> team limit 65536, capping at the hard limit
>> Target CUDA RTL --> Using 1024 CUDA threads per block
>> Target CUDA RTL --> Max number of CUDA blocks 65536, threads 1024 & warp
>> size 32
>> Target CUDA RTL --> Default number of teams set according to library's
>> default 128
>> Target CUDA RTL --> Default number of threads set according to library's
>> default 128
>> Libomptarget --> Device 0 is ready to use.
>> Target CUDA RTL --> Load data from image 0x000000000041ad20
>> Target CUDA RTL --> Error when loading CUDA module
>> Target CUDA RTL --> CUDA error is: device kernel image is invalid
>> Libomptarget --> Unable to generate entries table for device id 0.
>> Libomptarget --> Failed to init globals on device 0
>> Libomptarget --> Failed to get device 0 ready
>> Libomptarget fatal error 1: failure of target construct while offloading
>> is mandatory
>> Libomptarget --> Unloading target library!
>> Libomptarget --> Image 0x000000000041ad20 is compatible with RTL
>> 0x00000000015b3c40!
>> Libomptarget --> Unregistered image 0x000000000041ad20 from RTL
>> 0x00000000015b3c40!
>> Libomptarget --> Done unregistering images!
>> Libomptarget --> Removing translation table for descriptor
>> 0x0000000000440810
>> Libomptarget --> Done unregistering library!
>> Libomptarget --> Deinit target library!
>>
>> Any hints about this?
>>
>> Regards,
>> Gang Zhao
>>
>> On Thu, Mar 5, 2020 at 5:44 AM, Alexey Bataev <a.bataev at hotmail.com> wrote:
>>
>>> Neither libc++ nor libstdc++ can be linked with nvlink. NVIDIA does
>>> not provide device implementations of either libc++ or libstdc++. You
>>> must exclude the use of the standard C++ library from target regions.
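
One way to follow this advice, sketched under the assumption that the
container is only needed on the host: keep the std::vector outside the target
region and hand the region a raw pointer and a length. The helper name
scale_in_place is hypothetical and not from the thread.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: multiplies every element of v by factor.
// The std::vector is only touched on the host. The target region sees a raw
// pointer and a length, so it does not call into the C++ standard library.
void scale_in_place(std::vector<float> &v, float factor) {
  float *data = v.data();
  const int n = static_cast<int>(v.size());
  assert(data != nullptr || n == 0);
  #pragma omp target map(tofrom: data[:n])
  #pragma omp parallel for
  for (int i = 0; i < n; i++) {
    data[i] *= factor;
  }
}
```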
>>>
>>> Best regards,
>>> Alexey Bataev
>>>
>>> > On March 5, 2020, at 00:25, G Zhao via Openmp-dev <
>>> openmp-dev at lists.llvm.org> wrote:
>>> >
>>> > 
>>> > Hi,
>>> >
>>> > I just compiled LLVM and enabled NVPTX with
>>> -DLLVM_TARGETS_TO_BUILD="X86;NVPTX". But when I compiled my code using the
>>> command below:
>>> >
>>> > clang++ main.cpp -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda -o
>>> a_gpu.exe
>>> >
>>> > I got the below error:
>>> >
>>> >
>>> /usr/lib64/gcc/x86_64-pc-linux-gnu/9.2.1/../../../../include/c++/9.2.1/bits/std_abs.h:75:3:
>>> error: declaration conflicts with target of using declaration already in
>>> scope
>>> >   abs(float __x)
>>> >
>>> > I think the reason is that I am using GCC 9. I did a bit of searching, and
>>> someone said using libc++ can address this. So I compiled libcxx and
>>> libcxxabi, and used the command below to compile my code again:
>>> >
>>> > clang++ -stdlib=libc++  main.cpp -fopenmp
>>> -fopenmp-targets=nvptx64-nvidia-cuda -o a_gpu.exe
>>> >
>>> > I got different errors:
>>> >
>>> > nvlink error   : Undefined reference to
>>> '_ZNKSt3__120__vector_base_commonILb1EE20__throw_length_errorEv' in
>>> '/tmp/main-42e0a6.cubin'
>>> > nvlink error   : Undefined reference to 'abort' in
>>> '/tmp/main-42e0a6.cubin'
>>> >
>>> > I think the reason here is that nvlink doesn't know it should link libc++
>>> together with those cubin files. But I don't know how to solve this.
>>> >
>>> > Does anyone know a workaround for this?
>>> >
>>> > Thanks!
>>> >
>>> >
>>> >
>>> >
>>> > _______________________________________________
>>> > Openmp-dev mailing list
>>> > Openmp-dev at lists.llvm.org
>>> > https://lists.llvm.org/cgi-bin/mailman/listinfo/openmp-dev
>>>
>>