[Openmp-dev] Target CUDA RTL --> The primary context is inactive, set its flags to CU_CTX_SCHED_BLOCKING_SYNC

Itaru Kitayama via Openmp-dev openmp-dev at lists.llvm.org
Mon Sep 28 19:55:06 PDT 2020


Ye,
Do you use Environment Modules as your package manager? Which environment
variables do you set when building Clang and when running the app?

On Tue, Sep 29, 2020 at 11:29 AM Ye Luo <xw111luoye at gmail.com> wrote:
>
> It's still not clear what went wrong. I just installed the clang 11 release on my local cluster with sm_35, and no issues show up.
> Is this issue exposed only by a complicated app, or does even a simple "omp target" region hang? Are you able to run any CUDA program at all?
> The call stack indicates that cuDevicePrimaryCtxRetain tries to interact with the driver, but the driver doesn't respond and keeps the host side waiting.
> It's also not clear whether /usr/local/software/jureca/Stages/2019a/software/nvidia/driver/lib64/libcuda.so.1 is consistent with the running Linux kernel driver.
> Could you ask Julich whether they have any clue about the settings on this machine?
> Ye
> ===================
> Ye Luo, Ph.D.
> Computational Science Division & Leadership Computing Facility
> Argonne National Laboratory
>
>
> On Mon, Sep 28, 2020 at 6:22 PM Itaru Kitayama <itaru.kitayama at gmail.com> wrote:
>>
>> $ which nvcc
>> /usr/local/software/jureca/Stages/2019a/software/CUDA/10.1.105/bin/nvcc
>> [kitayama1 at jrc0004 kitayama1]$ nvcc --version
>> nvcc: NVIDIA (R) Cuda compiler driver
>> Copyright (c) 2005-2019 NVIDIA Corporation
>> Built on Fri_Feb__8_19:08:17_PST_2019
>> Cuda compilation tools, release 10.1, V10.1.105
>> [kitayama1 at jrc0004 kitayama1]$ ldd
>> /p/project/cjzam11/kitayama1/opt/clang/current/lib/libomptarget.rtl.cuda.so
>> linux-vdso.so.1 =>  (0x00007ffc2a767000)
>> libcuda.so.1 =>
>> /usr/local/software/jureca/Stages/2019a/software/nvidia/driver/lib64/libcuda.so.1
>> (0x00002ac5418b9000)
>> libelf.so.1 => /usr/lib64/libelf.so.1 (0x00002ac542aa1000)
>> libc++.so.1 => /p/project/cjzam11/kitayama1/opt/clang/current/lib/libc++.so.1
>> (0x00002ac5416d7000)
>> libc++abi.so.1 =>
>> /p/project/cjzam11/kitayama1/opt/clang/current/lib/libc++abi.so.1
>> (0x00002ac5417a0000)
>> libm.so.6 => /usr/lib64/libm.so.6 (0x00002ac542cb9000)
>> libgcc_s.so.1 =>
>> /usr/local/software/jureca/Stages/2019a/software/GCCcore/8.3.0/lib64/libgcc_s.so.1
>> (0x00002ac5417e2000)
>> libc.so.6 => /usr/lib64/libc.so.6 (0x00002ac542fbb000)
>> libdl.so.2 => /usr/lib64/libdl.so.2 (0x00002ac543389000)
>> libpthread.so.0 => /usr/lib64/libpthread.so.0 (0x00002ac54358d000)
>> librt.so.1 => /usr/lib64/librt.so.1 (0x00002ac5437a9000)
>> libz.so.1 => /usr/local/software/jureca/Stages/2019a/software/zlib/1.2.11-GCCcore-8.3.0/lib/libz.so.1
>> (0x00002ac5417fd000)
>> /lib64/ld-linux-x86-64.so.2 (0x00002ac541695000)
>> libatomic.so.1 =>
>> /usr/local/software/jureca/Stages/2019a/software/GCCcore/8.3.0/lib64/libatomic.so.1
>> (0x00002ac541816000)
>> [kitayama1 at jrc0004 kitayama1]$ nvidia-smi
>> Tue Sep 29 01:21:23 2020
>> +-----------------------------------------------------------------------------+
>> | NVIDIA-SMI 440.33.01    Driver Version: 440.33.01    CUDA Version: 10.2     |
>> |-------------------------------+----------------------+----------------------+
>> | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
>> | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
>> |===============================+======================+======================|
>> |   0  Tesla K80           On   | 00000000:06:00.0 Off |                    0 |
>> | N/A   27C    P8    26W / 149W |      0MiB / 11441MiB |      0%      Default |
>> +-------------------------------+----------------------+----------------------+
>> |   1  Tesla K80           On   | 00000000:07:00.0 Off |                    0 |
>> | N/A   27C    P8    29W / 149W |      0MiB / 11441MiB |      0%      Default |
>> +-------------------------------+----------------------+----------------------+
>> |   2  Tesla K80           On   | 00000000:86:00.0 Off |                    0 |
>> | N/A   29C    P8    25W / 149W |      0MiB / 11441MiB |      0%      Default |
>> +-------------------------------+----------------------+----------------------+
>> |   3  Tesla K80           On   | 00000000:87:00.0 Off |                    0 |
>> | N/A   27C    P8    30W / 149W |      0MiB / 11441MiB |      0%      Default |
>> +-------------------------------+----------------------+----------------------+
>>
>> +-----------------------------------------------------------------------------+
>> | Processes:                                                       GPU Memory |
>> |  GPU       PID   Type   Process name                             Usage      |
>> |=============================================================================|
>> |  No running processes found                                                 |
>> +-----------------------------------------------------------------------------+
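[Editor's note: the nvidia-smi output above reports the kernel-side driver as 440.33.01 / CUDA 10.2. A quick way to cross-check the running kernel module against whichever user-space libcuda.so.1 the loader resolves is sketched below; the commands are standard, but paths vary per system and on this cluster libcuda.so.1 lives under a module path rather than the ldconfig cache:]

```shell
# Kernel-mode driver version, as reported by the running nvidia module
cat /proc/driver/nvidia/version 2>/dev/null || echo "nvidia kernel module not loaded"

# Which user-space libcuda.so.1 the dynamic loader would resolve by default
ldconfig -p 2>/dev/null | grep libcuda.so.1 || echo "libcuda.so.1 not in ldconfig cache"
```

If the version in /proc/driver/nvidia/version does not match the libcuda.so.1 that ldd showed earlier in this thread, driver calls can misbehave in exactly this kind of hard-to-reproduce way.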
>>
>> On Tue, Sep 29, 2020 at 7:45 AM Ye Luo <xw111luoye at gmail.com> wrote:
>> >
>> > Could you provide
>> > `which nvcc`
>> > `nvcc --version`
>> > `ldd /p/project/cjzam11/kitayama1/opt/clang/current/lib/libomptarget.rtl.cuda.so`
>> > and nvidia-smi output?
>> > Ye
>> >
>> > ===================
>> > Ye Luo, Ph.D.
>> > Computational Science Division & Leadership Computing Facility
>> > Argonne National Laboratory
>> >
>> >
>> > On Mon, Sep 28, 2020 at 5:11 PM Itaru Kitayama via Openmp-dev <openmp-dev at lists.llvm.org> wrote:
>> >>
>> >> This happens in an unpredictable way even though I launch the app the same way.
>> >>
>> >> On Mon, Sep 28, 2020 at 7:34 AM Itaru Kitayama <itaru.kitayama at gmail.com> wrote:
>> >> >
>> >> > No, I take that back. Here's the backtrace:
>> >> >
>> >> > (gdb) where
>> >> > #0  0x00002aaaaaacd6c2 in clock_gettime ()
>> >> > #1  0x00002aaaabd167fd in clock_gettime () from /usr/lib64/libc.so.6
>> >> > #2  0x00002aaaac97837e in ?? ()
>> >> >    from /usr/local/software/jureca/Stages/2019a/software/nvidia/driver/lib64/libcuda.so.1
>> >> > #3  0x00002aaaaca3c4f7 in ?? ()
>> >> >    from /usr/local/software/jureca/Stages/2019a/software/nvidia/driver/lib64/libcuda.so.1
>> >> > #4  0x00002aaaac87240a in ?? ()
>> >> >    from /usr/local/software/jureca/Stages/2019a/software/nvidia/driver/lib64/libcuda.so.1
>> >> > #5  0x00002aaaac91bfbe in ?? ()
>> >> >    from /usr/local/software/jureca/Stages/2019a/software/nvidia/driver/lib64/libcuda.so.1
>> >> > #6  0x00002aaaac91e0d7 in ?? ()
>> >> >    from /usr/local/software/jureca/Stages/2019a/software/nvidia/driver/lib64/libcuda.so.1
>> >> > #7  0x00002aaaac848719 in ?? ()
>> >> >    from /usr/local/software/jureca/Stages/2019a/software/nvidia/driver/lib64/libcuda.so.1
>> >> > #8  0x00002aaaac9ba15e in cuDevicePrimaryCtxRetain ()
>> >> >    from /usr/local/software/jureca/Stages/2019a/software/nvidia/driver/lib64/libcuda.so.1
>> >> > #9  0x00002aaaac514757 in __tgt_rtl_init_device ()
>> >> >    from /p/project/cjzam11/kitayama1/opt/clang/current/lib/libomptarget.rtl.cuda.so
>> >> > #10 0x00002aaaab9b88bb in DeviceTy::init() ()
>> >> >    from /p/project/cjzam11/kitayama1/opt/clang/current/lib/libomptarget.so
>> >> > #11 0x00002aaaac279348 in std::__1::__call_once(unsigned long
>> >> > volatile&, void*, void (*)(void*)) ()
>> >> >    from /p/project/cjzam11/kitayama1/opt/clang/current/lib/libc++.so.1
>> >> > #12 0x00002aaaab9b8d88 in device_is_ready(int) ()
>> >> >    from /p/project/cjzam11/kitayama1/opt/clang/current/lib/libomptarget.so
>> >> > #13 0x00002aaaab9c5296 in CheckDeviceAndCtors(long) ()
>> >> >    from /p/project/cjzam11/kitayama1/opt/clang/current/lib/libomptarget.so
>> >> > #14 0x00002aaaab9bbead in __tgt_target_data_begin_mapper ()
>> >> >    from /p/project/cjzam11/kitayama1/opt/clang/current/lib/libomptarget.so
>> >> > #15 0x00002aaaaabfaa58 in nest::SimulationManager::initialize() (this=0x5d3290)
>> >> >     at /p/project/cjzam11/kitayama1/projects/nest-simulator/nestkernel/simulation_manager.cpp:76
>> >> > #16 0x00002aaaaabf2c69 in nest::KernelManager::initialize() (this=0x5d3190)
>> >> >     at /p/project/cjzam11/kitayama1/projects/nest-simulator/nestkernel/kernel_manager.cpp:88
>> >> > #17 0x0000000000405769 in neststartup(int*, char***, SLIInterpreter&) (
>> >> >     argc=argc at entry=0x7fffffff0a84, argv=argv at entry=0x7fffffff0a88, engine=...)
>> >> >     at /p/project/cjzam11/kitayama1/projects/nest-simulator/nest/neststartup.cpp:87
>> >> > #18 0x0000000000405650 in main (argc=<optimized out>, argv=<optimized out>)
>> >> >     at /p/project/cjzam11/kitayama1/projects/nest-simulator/nest/main.cpp:42
>> >> >
>> >> > On Mon, Sep 28, 2020 at 5:22 AM Itaru Kitayama <itaru.kitayama at gmail.com> wrote:
>> >> > >
>> >> > > I obtained the desired result (a crash) without a Spack environment.
>> >> > >
>> >> > > On Sun, Sep 27, 2020 at 1:13 PM Itaru Kitayama <itaru.kitayama at gmail.com> wrote:
>> >> > > >
>> >> > > > (gdb) where
>> >> > > > #0  0x00002aaaaaacd6c2 in clock_gettime ()
>> >> > > > #1  0x00002aaaabd347fd in clock_gettime () from /usr/lib64/libc.so.6
>> >> > > > #2  0x00002aaaac98737e in ?? ()
>> >> > > >    from /usr/local/software/jureca/Stages/2019a/software/nvidia/driver/lib64/libcuda.so.1
>> >> > > > #3  0x00002aaaaca4b4f7 in ?? ()
>> >> > > >    from /usr/local/software/jureca/Stages/2019a/software/nvidia/driver/lib64/libcuda.so.1
>> >> > > > #4  0x00002aaaac88140a in ?? ()
>> >> > > >    from /usr/local/software/jureca/Stages/2019a/software/nvidia/driver/lib64/libcuda.so.1
>> >> > > > #5  0x00002aaaac92afbe in ?? ()
>> >> > > >    from /usr/local/software/jureca/Stages/2019a/software/nvidia/driver/lib64/libcuda.so.1
>> >> > > > #6  0x00002aaaac92d0d7 in ?? ()
>> >> > > >    from /usr/local/software/jureca/Stages/2019a/software/nvidia/driver/lib64/libcuda.so.1
>> >> > > > #7  0x00002aaaac857719 in ?? ()
>> >> > > >    from /usr/local/software/jureca/Stages/2019a/software/nvidia/driver/lib64/libcuda.so.1
>> >> > > > #8  0x00002aaaac9c915e in cuDevicePrimaryCtxRetain ()
>> >> > > >    from /usr/local/software/jureca/Stages/2019a/software/nvidia/driver/lib64/libcuda.so.1
>> >> > > > #9  0x00002aaaac523757 in __tgt_rtl_init_device ()
>> >> > > >    from /p/project/cjzam11/kitayama1/opt/clang/current/lib/libomptarget.rtl.cuda.so
>> >> > > > #10 0x00002aaaaaca28bb in DeviceTy::init() ()
>> >> > > >    from /p/project/cjzam11/kitayama1/opt/clang/current/lib/libomptarget.so
>> >> > > > #11 0x00002aaaac297348 in std::__1::__call_once(unsigned long
>> >> > > > volatile&, void*, void (*)(void*)) ()
>> >> > > >    from /p/project/cjzam11/kitayama1/opt/clang/current/lib/libc++.so.1
>> >> > > > #12 0x00002aaaaaca2d88 in device_is_ready(int) ()
>> >> > > >    from /p/project/cjzam11/kitayama1/opt/clang/current/lib/libomptarget.so
>> >> > > > #13 0x00002aaaaacaf296 in CheckDeviceAndCtors(long) ()
>> >> > > >    from /p/project/cjzam11/kitayama1/opt/clang/current/lib/libomptarget.so
>> >> > > > #14 0x00002aaaaaca5ead in __tgt_target_data_begin_mapper ()
>> >> > > >    from /p/project/cjzam11/kitayama1/opt/clang/current/lib/libomptarget.so
>> >> > > > #15 0x00002aaaab3a4958 in nest::SimulationManager::initialize() (this=0x5d3480)
>> >> > > >     at /p/project/cjzam11/kitayama1/projects/nest-simulator/nestkernel/simulation_manager.cpp:76
>> >> > > > #16 0x00002aaaab39cbb9 in nest::KernelManager::initialize() (this=0x5d3380)
>> >> > > >     at /p/project/cjzam11/kitayama1/projects/nest-simulator/nestkernel/kernel_manager.cpp:88
>> >> > > > #17 0x0000000000405769 in neststartup(int*, char***, SLIInterpreter&) (
>> >> > > >     argc=argc at entry=0x7ffffffee554, argv=argv at entry=0x7ffffffee558, engine=...)
>> >> > > >     at /p/project/cjzam11/kitayama1/projects/nest-simulator/nest/neststartup.cpp:87
>> >> > > > #18 0x0000000000405650 in main (argc=<optimized out>, argv=<optimized out>)
>> >> > > >     at /p/project/cjzam11/kitayama1/projects/nest-simulator/nest/main.cpp:42
>> >> > > >
>> >> > > > On Sun, Sep 27, 2020 at 12:55 PM Itaru Kitayama
>> >> > > > <itaru.kitayama at gmail.com> wrote:
>> >> > > > >
>> >> > > > > And when this happens, no signal is caught immediately by the system.
>> >> > > > >
>> >> > > > > On Sun, Sep 27, 2020 at 12:52 PM Itaru Kitayama
>> >> > > > > <itaru.kitayama at gmail.com> wrote:
>> >> > > > > >
>> >> > > > > > I often see this when executing my work-in-progress offloading app on x86
>> >> > > > > > with an older NVIDIA GPU (sm_35). Can someone enlighten me on this so I
>> >> > > > > > can solve it quickly?
>> >> > > > > >
>> >> > > > > > Thanks,
>> >> _______________________________________________
>> >> Openmp-dev mailing list
>> >> Openmp-dev at lists.llvm.org
>> >> https://lists.llvm.org/cgi-bin/mailman/listinfo/openmp-dev

