[llvm-dev] Compiling OpenMP for GPUs when there are function calls in the parallel regions
Alexey Bataev via llvm-dev
llvm-dev at lists.llvm.org
Sun Mar 13 21:16:16 PDT 2016
Hi,
Most likely you forgot to enclose the definition of the 'foo()' function in a '#pragma omp declare target' region, so no device version of it is generated. You can do it like this:
#pragma omp declare target
void foo() {
...
}
#pragma omp end declare target
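For a fuller picture, here is a minimal sketch of how the declared function might be called from an offloaded region. The body of foo() and the helper sum_foo() are hypothetical, made up only for illustration; they are not taken from the original test case.

```c
/* Functions defined between these two pragmas are compiled for the
 * device as well as for the host. */
#pragma omp declare target
int foo(int x) {
    /* Hypothetical body, for illustration only. */
    return x * x;
}
#pragma omp end declare target

/* Hypothetical helper: sums foo(i) for i in [0, n) inside a target region. */
int sum_foo(int n) {
    int sum = 0;
    /* Map 'sum' explicitly so the result is copied back to the host. */
    #pragma omp target map(tofrom: sum)
    #pragma omp parallel for reduction(+: sum)
    for (int i = 0; i < n; ++i)
        sum += foo(i); /* resolves on the device because foo is declare target */
    return sum;
}
```

Calling sum_foo() from main() is enough to exercise it. If the code is built without OpenMP support, the pragmas are ignored and the loop simply runs on the host with the same result.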
Best regards,
Alexey Bataev
=============
Software Engineer
Intel Compiler Team
On 11.03.2016 21:10, Ahmed ElTantawy via llvm-dev wrote:
Hi,
Using the flow described here: https://parallel-computing.pro/index.php/9-cuda/43-openmp-4-0-on-nvidia-cuda-gpus, I can compile and run OpenMP code on GPUs when the parallel region is self-contained (i.e., does not include calls to functions).
When the parallel region includes a call to a function (e.g., foo()), I get this error:
nvlink error : Undefined reference to 'foo' in '/tmp/test.o-e8741d.cubin'
"foo" is indeed declared and defined in the same file, before the main function, but the clang driver does not include it in the final PTX file (test.s.tgt-nvptx64sm_30-nvidia-linux).
In CUDA terminology, are "device functions" not yet supported in OpenMP?
Thanks.