<div dir="ltr"><div>Could you provide the output of the following commands?<br></div><div>`which nvcc`</div><div>`nvcc --version`</div><div>`ldd /p/project/cjzam11/kitayama1/opt/clang/current/lib/libomptarget.rtl.cuda.so`</div><div>`nvidia-smi`</div><div>Ye</div><div><br></div><div><div><div><div dir="ltr" class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr"><div><div dir="ltr">===================<br>
Ye Luo, Ph.D.<br>Computational Science Division & Leadership Computing Facility<br>
Argonne National Laboratory</div></div></div></div></div><br></div></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Mon, Sep 28, 2020 at 5:11 PM Itaru Kitayama via Openmp-dev <<a href="mailto:openmp-dev@lists.llvm.org">openmp-dev@lists.llvm.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">This happens unpredictably, even though I launch the app the same way every time.<br>
<br>
On Mon, Sep 28, 2020 at 7:34 AM Itaru Kitayama <<a href="mailto:itaru.kitayama@gmail.com" target="_blank">itaru.kitayama@gmail.com</a>> wrote:<br>
><br>
> No, I take that back. Here's the backtrace:<br>
><br>
> (gdb) where<br>
> #0  0x00002aaaaaacd6c2 in clock_gettime ()<br>
> #1  0x00002aaaabd167fd in clock_gettime () from /usr/lib64/libc.so.6<br>
> #2  0x00002aaaac97837e in ?? ()<br>
>    from /usr/local/software/jureca/Stages/2019a/software/nvidia/driver/lib64/libcuda.so.1<br>
> #3  0x00002aaaaca3c4f7 in ?? ()<br>
>    from /usr/local/software/jureca/Stages/2019a/software/nvidia/driver/lib64/libcuda.so.1<br>
> #4  0x00002aaaac87240a in ?? ()<br>
>    from /usr/local/software/jureca/Stages/2019a/software/nvidia/driver/lib64/libcuda.so.1<br>
> #5  0x00002aaaac91bfbe in ?? ()<br>
>    from /usr/local/software/jureca/Stages/2019a/software/nvidia/driver/lib64/libcuda.so.1<br>
> #6  0x00002aaaac91e0d7 in ?? ()<br>
>    from /usr/local/software/jureca/Stages/2019a/software/nvidia/driver/lib64/libcuda.so.1<br>
> #7  0x00002aaaac848719 in ?? ()<br>
>    from /usr/local/software/jureca/Stages/2019a/software/nvidia/driver/lib64/libcuda.so.1<br>
> #8  0x00002aaaac9ba15e in cuDevicePrimaryCtxRetain ()<br>
>    from /usr/local/software/jureca/Stages/2019a/software/nvidia/driver/lib64/libcuda.so.1<br>
> #9  0x00002aaaac514757 in __tgt_rtl_init_device ()<br>
>    from /p/project/cjzam11/kitayama1/opt/clang/current/lib/libomptarget.rtl.cuda.so<br>
> #10 0x00002aaaab9b88bb in DeviceTy::init() ()<br>
>    from /p/project/cjzam11/kitayama1/opt/clang/current/lib/libomptarget.so<br>
> #11 0x00002aaaac279348 in std::__1::__call_once(unsigned long volatile&, void*, void (*)(void*)) ()<br>
>    from /p/project/cjzam11/kitayama1/opt/clang/current/lib/libc++.so.1<br>
> #12 0x00002aaaab9b8d88 in device_is_ready(int) ()<br>
>    from /p/project/cjzam11/kitayama1/opt/clang/current/lib/libomptarget.so<br>
> #13 0x00002aaaab9c5296 in CheckDeviceAndCtors(long) ()<br>
>    from /p/project/cjzam11/kitayama1/opt/clang/current/lib/libomptarget.so<br>
> #14 0x00002aaaab9bbead in __tgt_target_data_begin_mapper ()<br>
>    from /p/project/cjzam11/kitayama1/opt/clang/current/lib/libomptarget.so<br>
> #15 0x00002aaaaabfaa58 in nest::SimulationManager::initialize() (this=0x5d3290)<br>
>     at /p/project/cjzam11/kitayama1/projects/nest-simulator/nestkernel/simulation_manager.cpp:76<br>
> #16 0x00002aaaaabf2c69 in nest::KernelManager::initialize() (this=0x5d3190)<br>
>     at /p/project/cjzam11/kitayama1/projects/nest-simulator/nestkernel/kernel_manager.cpp:88<br>
> #17 0x0000000000405769 in neststartup(int*, char***, SLIInterpreter&) (<br>
>     argc=argc@entry=0x7fffffff0a84, argv=argv@entry=0x7fffffff0a88, engine=...)<br>
>     at /p/project/cjzam11/kitayama1/projects/nest-simulator/nest/neststartup.cpp:87<br>
> #18 0x0000000000405650 in main (argc=<optimized out>, argv=<optimized out>)<br>
>     at /p/project/cjzam11/kitayama1/projects/nest-simulator/nest/main.cpp:42<br>
><br>
> On Mon, Sep 28, 2020 at 5:22 AM Itaru Kitayama <<a href="mailto:itaru.kitayama@gmail.com" target="_blank">itaru.kitayama@gmail.com</a>> wrote:<br>
> ><br>
> > I obtained the desired result (a crash) without a Spack environment.<br>
> ><br>
> > On Sun, Sep 27, 2020 at 1:13 PM Itaru Kitayama <<a href="mailto:itaru.kitayama@gmail.com" target="_blank">itaru.kitayama@gmail.com</a>> wrote:<br>
> > ><br>
> > > (gdb) where<br>
> > > #0  0x00002aaaaaacd6c2 in clock_gettime ()<br>
> > > #1  0x00002aaaabd347fd in clock_gettime () from /usr/lib64/libc.so.6<br>
> > > #2  0x00002aaaac98737e in ?? ()<br>
> > >    from /usr/local/software/jureca/Stages/2019a/software/nvidia/driver/lib64/libcuda.so.1<br>
> > > #3  0x00002aaaaca4b4f7 in ?? ()<br>
> > >    from /usr/local/software/jureca/Stages/2019a/software/nvidia/driver/lib64/libcuda.so.1<br>
> > > #4  0x00002aaaac88140a in ?? ()<br>
> > >    from /usr/local/software/jureca/Stages/2019a/software/nvidia/driver/lib64/libcuda.so.1<br>
> > > #5  0x00002aaaac92afbe in ?? ()<br>
> > >    from /usr/local/software/jureca/Stages/2019a/software/nvidia/driver/lib64/libcuda.so.1<br>
> > > #6  0x00002aaaac92d0d7 in ?? ()<br>
> > >    from /usr/local/software/jureca/Stages/2019a/software/nvidia/driver/lib64/libcuda.so.1<br>
> > > #7  0x00002aaaac857719 in ?? ()<br>
> > >    from /usr/local/software/jureca/Stages/2019a/software/nvidia/driver/lib64/libcuda.so.1<br>
> > > #8  0x00002aaaac9c915e in cuDevicePrimaryCtxRetain ()<br>
> > >    from /usr/local/software/jureca/Stages/2019a/software/nvidia/driver/lib64/libcuda.so.1<br>
> > > #9  0x00002aaaac523757 in __tgt_rtl_init_device ()<br>
> > >    from /p/project/cjzam11/kitayama1/opt/clang/current/lib/libomptarget.rtl.cuda.so<br>
> > > #10 0x00002aaaaaca28bb in DeviceTy::init() ()<br>
> > >    from /p/project/cjzam11/kitayama1/opt/clang/current/lib/libomptarget.so<br>
> > > #11 0x00002aaaac297348 in std::__1::__call_once(unsigned long volatile&, void*, void (*)(void*)) ()<br>
> > >    from /p/project/cjzam11/kitayama1/opt/clang/current/lib/libc++.so.1<br>
> > > #12 0x00002aaaaaca2d88 in device_is_ready(int) ()<br>
> > >    from /p/project/cjzam11/kitayama1/opt/clang/current/lib/libomptarget.so<br>
> > > #13 0x00002aaaaacaf296 in CheckDeviceAndCtors(long) ()<br>
> > >    from /p/project/cjzam11/kitayama1/opt/clang/current/lib/libomptarget.so<br>
> > > #14 0x00002aaaaaca5ead in __tgt_target_data_begin_mapper ()<br>
> > >    from /p/project/cjzam11/kitayama1/opt/clang/current/lib/libomptarget.so<br>
> > > #15 0x00002aaaab3a4958 in nest::SimulationManager::initialize() (this=0x5d3480)<br>
> > >     at /p/project/cjzam11/kitayama1/projects/nest-simulator/nestkernel/simulation_manager.cpp:76<br>
> > > #16 0x00002aaaab39cbb9 in nest::KernelManager::initialize() (this=0x5d3380)<br>
> > >     at /p/project/cjzam11/kitayama1/projects/nest-simulator/nestkernel/kernel_manager.cpp:88<br>
> > > #17 0x0000000000405769 in neststartup(int*, char***, SLIInterpreter&) (<br>
> > >     argc=argc@entry=0x7ffffffee554, argv=argv@entry=0x7ffffffee558, engine=...)<br>
> > >     at /p/project/cjzam11/kitayama1/projects/nest-simulator/nest/neststartup.cpp:87<br>
> > > #18 0x0000000000405650 in main (argc=<optimized out>, argv=<optimized out>)<br>
> > >     at /p/project/cjzam11/kitayama1/projects/nest-simulator/nest/main.cpp:42<br>
> > ><br>
> > > On Sun, Sep 27, 2020 at 12:55 PM Itaru Kitayama<br>
> > > <<a href="mailto:itaru.kitayama@gmail.com" target="_blank">itaru.kitayama@gmail.com</a>> wrote:<br>
> > > ><br>
> > > > When this happens, signals sent to the process are not handled immediately.<br>
> > > ><br>
> > > > On Sun, Sep 27, 2020 at 12:52 PM Itaru Kitayama<br>
> > > > <<a href="mailto:itaru.kitayama@gmail.com" target="_blank">itaru.kitayama@gmail.com</a>> wrote:<br>
> > > > ><br>
> > > > > I often see this when executing my work-in-progress offloading app on x86<br>
> > > > > with an older NVIDIA GPU (sm_35). Can someone enlighten me on this so I<br>
> > > > > can solve it quickly?<br>
> > > > ><br>
> > > > > Thanks,<br>
</blockquote></div>
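<div>The four diagnostics requested at the top of this message can be gathered in one pass. The following is only a sketch: the helper names `run_diagnostic` and `collect` are made up for illustration, the 30-second timeout is an arbitrary assumption (added because a stuck driver could make `nvidia-smi` block as well), and the plugin path is simply the one that appears in the backtraces above.</div>

```python
import shutil
import subprocess


def run_diagnostic(argv, timeout=30):
    """Run one command; return its combined stdout/stderr, or a note on why it did not run."""
    # Report a missing tool instead of raising, so the rest of the report still prints.
    if shutil.which(argv[0]) is None:
        return f"{argv[0]}: not found on PATH"
    try:
        proc = subprocess.run(
            argv,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
            timeout=timeout,  # assumption: give up rather than wait on a stuck tool
        )
    except subprocess.TimeoutExpired:
        return f"{argv[0]}: no output after {timeout} s"
    return proc.stdout.strip()


def collect(plugin_path):
    """Gather the outputs asked for in this thread, keyed by the command line."""
    report = {"which nvcc": shutil.which("nvcc") or "nvcc: not found on PATH"}
    report["nvcc --version"] = run_diagnostic(["nvcc", "--version"])
    report["ldd " + plugin_path] = run_diagnostic(["ldd", plugin_path])
    report["nvidia-smi"] = run_diagnostic(["nvidia-smi"])
    return report


if __name__ == "__main__":
    # Path copied from the backtrace in this thread; adjust for other installs.
    plugin = "/p/project/cjzam11/kitayama1/opt/clang/current/lib/libomptarget.rtl.cuda.so"
    for title, output in collect(plugin).items():
        print(f"$ {title}\n{output}\n")
```

<div>Run it in the same shell environment (modules, `PATH`, `LD_LIBRARY_PATH`) that is used to launch the application, since the point of the `ldd` output is to see which `libcuda.so.1` the plugin resolves to there.</div>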