[llvm-dev] Compiling OpenMP for GPUs when there are function calls in the parallel regions

Fri Mar 11 10:10:18 PST 2016

Hi,

Using the flow described here:
https://parallel-computing.pro/index.php/9-cuda/43-openmp-4-0-on-nvidia-cuda-gpus,
I can compile and run OpenMP code on GPUs when the parallel region is
self-contained (i.e., does not include calls to functions).

When the parallel region includes a call to a function (e.g., foo()), I get
this error:

nvlink error   : Undefined reference to 'foo' in '/tmp/test.o-e8741d.cubin'

"foo" is indeed declared and defined in the same file, before the main
function, but the clang driver does not include it in the final PTX file
(test.s.tgt-nvptx64sm_30-nvidia-linux).
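
A minimal sketch of the pattern described above, for illustration only. The
body of foo and the helper name double_on_device are hypothetical, not taken
from the actual test file. The OpenMP 4.0 specification defines the
"declare target" directive for marking functions that should also be compiled
for the device; the sketch assumes that this is how foo would be made visible
to the device image. Whether the toolchain from the link above honors the
directive is not confirmed here.

```c
/* Assumption: OpenMP 4.0 "declare target" asks the compiler to emit a
 * device version of foo in addition to the host version. */
#pragma omp declare target
int foo(int x)
{
    return 2 * x;   /* hypothetical body, chosen only for illustration */
}
#pragma omp end declare target

/* Hypothetical helper: offloads a loop whose body calls foo(). */
void double_on_device(int *a, int n)
{
    /* Copy a[0..n-1] to the device, run the loop there, copy it back. */
    #pragma omp target map(tofrom: a[0:n])
    #pragma omp parallel for
    for (int i = 0; i < n; ++i)
        a[i] = foo(a[i]);
}
```

If the pragmas are ignored (e.g., when compiling without OpenMP enabled), the
same code simply runs on the host, which makes it easy to compare host and
device results.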

Using CUDA terminology, are "device functions" not yet supported in
OpenMP?

Thanks.