[Openmp-dev] [EXTERNAL] OpenMP offload: Using global variable with a library result in "CUDA error: Loading global 'xxxx' Failed" or "nvlink error : Undefined reference to ..."

Huber, Joseph via Openmp-dev openmp-dev at lists.llvm.org
Wed Jan 5 06:00:20 PST 2022


Shared libraries work as long as offloading code isn't shared between the library and the application. When we do offload linking, we have a fat binary that contains code for both the host and the device. With static linking we can extract the device code and use it to create a complete device image prior to linking. Shared libraries are loaded at runtime, so we can't take the same approach and will need to leverage some kind of JIT to create a complete host binary. The work in the PR does not directly address the problem with shared libraries, but it will make it easier to implement JIT functionality and potentially allow linking with shared libraries.
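
To illustrate the constraint above, here is a minimal sketch (not taken from the patches and not tested on the system discussed in this thread) of how an application might avoid referencing the library's device global directly. It assumes the library keeps 'y' to itself and exposes host-callable accessors; the names lib_set_y, lib_get_y and app_sum are hypothetical and exist only for this example. Both halves are shown in one file for brevity, but the first half would live in the library and the second in the application.

```cpp
// ---- Library side (would be compiled into libtest.so) ----
// The device global is only ever referenced by offload code in this image.
#pragma omp declare target
int y;
#pragma omp end declare target

// Hypothetical accessor: set y on the host and push it to the device copy.
void lib_set_y(int value) {
  y = value;
  #pragma omp target update to(y)
}

// Hypothetical accessor: read the device copy of y back to the host.
int lib_get_y() {
  int x = 0;
  #pragma omp target map(from: x)
  {
    x = y;
  }
  return x;
}

// ---- Application side (would be compiled into the executable) ----
// The application's target region never names 'y'. It fetches the value
// through the library and passes a private copy into its own kernel.
int app_sum(int n) {
  int y_copy = lib_get_y();
  int sum = 0;
  #pragma omp target teams distribute parallel for \
      map(tofrom: sum) reduction(+: sum) firstprivate(y_copy)
  for (int i = 0; i < n; i++) {
    sum += y_copy + i;
  }
  return sum;
}
```

With this layout, a caller would run lib_set_y(24) and then app_sum(5), which adds 24 + i for i = 0..4. The trade-off is an extra device-to-host copy of the value on each call to lib_get_y, which is only reasonable for small, rarely updated globals.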

Thanks,
Joseph Huber
________________________________
From: pramod kumbhar <pramod.s.kumbhar at gmail.com>
Sent: Wednesday, January 5, 2022 1:20 AM
To: Huber, Joseph <huberjn at ornl.gov>; Ye Luo <xw111luoye at gmail.com>
Cc: openmp-dev at lists.llvm.org <openmp-dev at lists.llvm.org>
Subject: Re: [EXTERNAL] [Openmp-dev] OpenMP offload: Using global variable with a library result in "CUDA error: Loading global 'xxxx' Failed" or "nvlink error : Undefined reference to ..."

Thank you Ye and Joseph for the clarifications! I am glad that the issue is already being addressed!

I would just like to add one clarification: issue/question #1 is addressed by switching to a shared library, but issue/question #2 (i.e. extern global variables in an offload region) remains with the shared library:

clang++ -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda -g -O2 -fpic -shared test.cpp -o libtest.so
clang++ -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda -g -O2 -o test3 main.cpp -L. -ltest -Wl,-rpath .
nvlink error   : Undefined reference to 'y' in '/gpfs/bbp.cscs.ch/ssd/slurmTmpFS/kumbhar/151397/main-1999e6.cubin'
clang-13: error: nvlink command failed with exit code 255 (use -v to see invocation)
clang version 13.0.0
Target: x86_64-unknown-linux-gnu
Thread model: posix

I haven't built the PR D116541 (https://reviews.llvm.org/D116541) locally yet, but I assume the underlying issue is the same and will be addressed by the same PR?

Thank you again!

On Wed, Jan 5, 2022 at 3:06 AM Huber, Joseph <huberjn at ornl.gov> wrote:
Kumbhar,

As Ye said, the current support for linking static libraries is a little flaky. I'm currently working on a series of patches to address this problem; they are under review at https://reviews.llvm.org/D116541 and its child revisions. I can currently link and run your tests using my development branch after adding some additional support to search the library paths for static libraries. This should solve your problem once these patches are ready to land upstream.

Thanks,
Joseph Huber
________________________________
From: Openmp-dev <openmp-dev-bounces at lists.llvm.org> on behalf of pramod kumbhar via Openmp-dev <openmp-dev at lists.llvm.org>
Sent: Tuesday, January 4, 2022 6:59 PM
To: openmp-dev at lists.llvm.org <openmp-dev at lists.llvm.org>
Subject: [EXTERNAL] [Openmp-dev] OpenMP offload: Using global variable with a library result in "CUDA error: Loading global 'xxxx' Failed" or "nvlink error : Undefined reference to ..."

Dear All,

I have two questions regarding the usage of global variables with OpenMP offload:

1. When code that uses a global variable is built into a static library, I get a "CUDA error: Loading global 'x' Failed" error at runtime. Here is a simple reproducer showing the issue:

$ cat test.sh

CXX=clang++
CXXFLAGS="-fopenmp -fopenmp-targets=nvptx64-nvidia-cuda -g -O2"
${CXX} ${CXXFLAGS} -c test.cpp
ar cq libtest.a test.o
${CXX} ${CXXFLAGS} -o test1 main.cpp -L. -ltest
${CXX} ${CXXFLAGS} -o test2 main.cpp test.o

$ cat test.cpp

#pragma omp declare target
int y;
#pragma omp end declare target

int test() {
  y = 24;
  #pragma omp target update to(y)
  y = 42;

  int x;
  #pragma omp target map(from:x)
  {
    x = y;
  }
  return x;
}

$ cat main.cpp
extern int test();

int main() {
  return test();
}

Running ./test2 works as expected because it links test.o directly rather than the static library, but ./test1 fails with the error shown below:

$ ./test2
$ echo $?
24
$ ./test1
CUDA error: Loading global 'y' Failed
CUDA error: named symbol not found
Libomptarget error: Unable to generate entries table for device id 0.
Libomptarget error: Failed to init globals on device 0
Libomptarget error: Run with LIBOMPTARGET_INFO=4 to dump host-target pointer mappings.
test.cpp:7:3: Libomptarget fatal error 1: failure of target construct while offloading is mandatory
Aborted

Is this expected behaviour? Is there any workaround? I have tested this with PGI/NVHPC compiler and it works there.

2. The second scenario is similar, but now I am trying to use a global variable from the library in an offload region in main.cpp, i.e. the modified main.cpp looks like this:

$ cat main.cpp
extern int test();

#include <cstdio>

#pragma omp declare target
extern int y;
#pragma omp end declare target

int main() {
  #pragma omp target teams distribute parallel for
  for(int i=0; i<5; i++) {
    printf("--> %d \n", y + i);
  }
  return test();
}

This now fails to compile with a library:

+ clang++ -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda -g -O2 -c test.cpp
+ ar cq libtest.a test.o
+ clang++ -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda -g -O2 -o test1 main.cpp -L. -ltest
nvlink error   : Undefined reference to 'y' in '/gpfs/bbp.cscs.ch/ssd/slurmTmpFS/kumbhar/151151/main-9ea2c6.cubin'
clang-13: error: nvlink command failed with exit code 255 (use -v to see invocation)
clang version 13.0.0
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /gpfs/bbp.cscs.ch/ssd/apps/bsd/2021-11/stage_externals/install_gcc-11.2.0-skylake/llvm-13.0.0-lvcrm6/bin

In a big application, we build & ship libraries and it's not easy/convenient to use objects for linking. Do you have any recommendations to solve/workaround issues for this use case?

Thank you very much!

Regards,
Pramod Kumbhar