[llvm-dev] [GPUCC] link against libdevice

Tue Aug 2 16:27:31 PDT 2016

As of r277542, clang should handle this correctly:
* clang now picks the correct libdevice version
* clang reports an error if the required libdevice library is not found.

See https://reviews.llvm.org/D23037 for details.
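
For readers who want a feel for the behavior, here is a rough Python sketch of the selection logic. It is not the code from D23037. The names `pick_libdevice` and `LibdeviceNotFound` are made up for illustration, and preferring an exact match when the CUDA install ships one is an assumption. The only mapping rule taken from this thread is the one quoted below from NVIDIA's libdevice guide: compute capabilities above 3.7 use the compute_30 bitcode. The rest of NVIDIA's table is deliberately not reproduced.

```python
import re


class LibdeviceNotFound(Exception):
    """Hypothetical error type: no usable libdevice bitcode file was found."""


def pick_libdevice(gpu_arch, available_files):
    """Return the libdevice file name to link for gpu_arch (e.g. "sm_50").

    available_files is the list of file names found under
    <cuda-path>/nvvm/libdevice.
    """
    match = re.fullmatch(r"sm_(\d+)", gpu_arch)
    if match is None:
        raise ValueError(f"unrecognized GPU arch: {gpu_arch!r}")
    capability = int(match.group(1))

    # Index the available files by the compute capability in their name,
    # e.g. "libdevice.compute_35.10.bc" -> 35.
    by_capability = {}
    for name in available_files:
        m = re.fullmatch(r"libdevice\.compute_(\d+)\.\d+\.bc", name)
        if m:
            by_capability[int(m.group(1))] = name

    # Assumption: use an exact match when the CUDA install provides one.
    if capability in by_capability:
        return by_capability[capability]

    # Rule cited in the thread: capabilities above 3.7 use compute_30.
    if capability > 37 and 30 in by_capability:
        return by_capability[30]

    # Report an error instead of silently continuing without libdevice.
    # (This sketch does not cover the other rows of NVIDIA's table.)
    raise LibdeviceNotFound(f"no libdevice bitcode found for {gpu_arch}")


# The three files listed in the thread for CUDA 7.0.
cuda_7_0 = [
    "libdevice.compute_20.10.bc",
    "libdevice.compute_30.10.bc",
    "libdevice.compute_35.10.bc",
]
print(pick_libdevice("sm_50", cuda_7_0))
```

With the CUDA 7.0 file list, sm_50 resolves to the compute_30 file; with an empty list the function raises rather than returning nothing, which mirrors the new error report.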

--Artem

On Mon, Aug 1, 2016 at 2:06 AM, Mueller-Roemer, Johannes Sebastian via
llvm-dev <llvm-dev at lists.llvm.org> wrote:

> According to
> http://docs.nvidia.com/cuda/libdevice-users-guide/basic-usage.html#version-selection
> compute capabilities > 3.7 should use libdevice.compute_30.XX.bc
>
> -----Original Message-----
> From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On Behalf Of
> Justin Lebar via llvm-dev
> Sent: Monday, August 1, 2016 08:33
> To: Yuanfeng Peng <yuanfeng at cis.upenn.edu>
> Cc: llvm-dev <llvm-dev at lists.llvm.org>
> Subject: Re: [llvm-dev] [GPUCC] link against libdevice
>
> OK, I see the problem.  You were right that we weren't picking up
> libdevice.
>
> CUDA 7.0 only ships with the following libdevice binaries (found in
> /path/to/cuda/nvvm/libdevice):
>
>   libdevice.compute_20.10.bc
>   libdevice.compute_30.10.bc
>   libdevice.compute_35.10.bc
>
> If you ask for sm_50 with CUDA 7.0, clang can't find a matching libdevice
> binary, and it apparently gives up silently and tries to continue
> compiling your program.  That's a bug that we should fix.
> (If you want the current behavior, you should have to ask clang not to use
> libdevice.)
>
> I see that nvcc from cuda 7.0 works (or at least builds without error).  I
> guess it uses the libdevice for compute_35.  We could do the same thing,
> although I am not sure how to tell whether that's safe in general.  I'll
> look into this as well.
>
> Anyway, if you build with CUDA 7.5 your problem should go away, because
> CUDA 7.5 has a libdevice binary for compute_50.  Just pass
> --cuda-path=/path/to/cuda-7.5.  Alternatively, you could continue building
> with CUDA 7.0 and pass sm_35 as your GPU arch.  clang always embeds PTX in
> the binaries, so the result should still run on your sm_50 card (although
> your machine will have to JIT the PTX on startup).
>
> As a third alternative, you could symlink your libdevice.compute_35.10.bc
> to libdevice.compute_50.10.bc, and...maybe that would work?  If you do
> that, please let me know how it goes, I am curious.  :)
>
> Thank you very much for the bug report!  If you like I'll cc you on any
> relevant changes, just create an account at https://reviews.llvm.org (if
> necessary; I can't seem to find you) and let me know your username.
>
> Regards,
> -Justin
>
> On Sun, Jul 31, 2016 at 10:59 PM, Yuanfeng Peng <yuanfeng at cis.upenn.edu>
> wrote:
> > Hi Justin,
> >
> > Thanks for your response!  The clang & llvm I'm using were built from
> > source.
> >
> > Below is the output of compiling with -v.  Any suggestions would be
> > appreciated!
> >
> > clang version 3.9.0 (trunk 270145) (llvm/trunk 270133)
> > Target: x86_64-unknown-linux-gnu
> > Thread model: posix
> > InstalledDir: /usr/local/bin
> > Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.8
> > Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.8.4
> > Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.9
> > Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.9.3
> > Selected GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.8
> > Candidate multilib: .;@m64
> > Candidate multilib: 32;@m32
> > Candidate multilib: x32;@mx32
> > Selected multilib: .;@m64
> > Found CUDA installation: /usr/local/cuda
> >  "/usr/local/bin/clang-3.9" -cc1 -triple
> > nvptx64-nvidia-cuda -aux-triple x86_64-unknown-linux-gnu -S
> > -disable-free -main-file-name scalarProd.cu -mrelocation-model static
> > -mthread-model posix -mdisable-fp-elim -fmath-errno -no-integrated-as
> > -fcuda-is-device -target-cpu sm_50 -v -dwarf-column-info
> > -debugger-tuning=gdb -resource-dir
> > /usr/local/bin/../lib/clang/3.9.0 -I ../ -I
> > /usr/local/cuda-7.0/samples/common/inc -internal-isystem
> > /usr/lib/gcc/x86_64-linux-gnu/4.8/../../../../include/c++/4.8
> > -internal-isystem
> > /usr/lib/gcc/x86_64-linux-gnu/4.8/../../../../include/x86_64-linux-gnu/c++/4.8
> > -internal-isystem
> > /usr/lib/gcc/x86_64-linux-gnu/4.8/../../../../include/x86_64-linux-gnu/c++/4.8
> > -internal-isystem
> > /usr/lib/gcc/x86_64-linux-gnu/4.8/../../../../include/c++/4.8/backward
> > -internal-isystem /usr/local/include -internal-isystem
> > /usr/local/bin/../lib/clang/3.9.0/include -internal-externc-isystem
> > /include -internal-externc-isystem /usr/include -internal-isystem
> > /usr/lib/gcc/x86_64-linux-gnu/4.8/../../../../include/c++/4.8
> > -internal-isystem
> > /usr/lib/gcc/x86_64-linux-gnu/4.8/../../../../include/x86_64-linux-gnu/c++/4.8
> > -internal-isystem
> > /usr/lib/gcc/x86_64-linux-gnu/4.8/../../../../include/x86_64-linux-gnu/c++/4.8
> > -internal-isystem
> > /usr/lib/gcc/x86_64-linux-gnu/4.8/../../../../include/c++/4.8/backward
> > -internal-isystem /usr/local/cuda/include -include
> > __clang_cuda_runtime_wrapper.h -fdeprecated-macro
> > -fno-dwarf-directory-asm -fdebug-compilation-dir
> > /mnt/wtf/workspace/cuda/gpu-race-detection/cuda-compressed-conflict-detection/scalarProd
> > -ferror-limit 19 -fmessage-length 144
> > -fobjc-runtime=gcc -fcxx-exceptions -fexceptions
> > -fdiagnostics-show-option -o /tmp/scalarProd-32a530.s -x cuda
> > scalarProd.cu
> > hooklib.so loading.
> > clang -cc1 version 3.9.0 based upon LLVM 3.9.0svn default target
> > x86_64-unknown-linux-gnu
> > ignoring nonexistent directory "/include"
> > ignoring duplicate directory "/usr/lib/gcc/x86_64-linux-gnu/4.8/../../../../include/x86_64-linux-gnu/c++/4.8"
> > ignoring duplicate directory "/usr/lib/gcc/x86_64-linux-gnu/4.8/../../../../include/c++/4.8"
> > ignoring duplicate directory "/usr/lib/gcc/x86_64-linux-gnu/4.8/../../../../include/x86_64-linux-gnu/c++/4.8"
> > ignoring duplicate directory "/usr/lib/gcc/x86_64-linux-gnu/4.8/../../../../include/x86_64-linux-gnu/c++/4.8"
> > ignoring duplicate directory "/usr/lib/gcc/x86_64-linux-gnu/4.8/../../../../include/c++/4.8/backward"
> > ignoring duplicate directory "/usr/local/include"
> > ignoring duplicate directory "/usr/local/bin/../lib/clang/3.9.0/include"
> > ignoring duplicate directory "/usr/include"
> > #include "..." search starts here:
> > #include <...> search starts here:
> >  ..
> >  /usr/local/cuda-7.0/samples/common/inc
> >  /usr/lib/gcc/x86_64-linux-gnu/4.8/../../../../include/c++/4.8
> >  /usr/lib/gcc/x86_64-linux-gnu/4.8/../../../../include/x86_64-linux-gnu/c++/4.8
> >  /usr/lib/gcc/x86_64-linux-gnu/4.8/../../../../include/c++/4.8/backward
> >  /usr/local/include
> >  /usr/local/bin/../lib/clang/3.9.0/include
> >  /usr/include
> >  /usr/local/cuda/include
> > End of search list.
> >
> >  "/usr/local/cuda/bin/ptxas" -m64 -O0 --gpu-name sm_50 --output-file
> > /tmp/scalarProd-181f7e.o /tmp/scalarProd-32a530.s
> > ptxas fatal   : Unresolved extern function '__nv_mul24'
> > clang-3.9: error: ptxas command failed with exit code 255 (use -v to see invocation)
> >
> > Thanks!
> > Yuanfeng
> >
> > On Mon, Aug 1, 2016 at 1:04 AM, Justin Lebar <jlebar at google.com> wrote:
> >>
> >> Hi, Yuanfeng.
> >>
> >> What version of clang are you using?  CUDA is only known to work at
> >> tip of head, so you must build clang yourself from source.
> >>
> >> I suspect that's your problem, but if building from source doesn't
> >> fix it, please attach the output of compiling with -v.
> >>
> >> Regards,
> >> -Justin
> >>
> >> On Sun, Jul 31, 2016 at 9:24 PM, Chandler Carruth
> >> <chandlerc at google.com>
> >> wrote:
> >> > Directly CC-ing some folks who may be able to help.
> >> >
> >> > On Fri, Jul 29, 2016 at 6:27 AM Yuanfeng Peng via llvm-dev
> >> > <llvm-dev at lists.llvm.org> wrote:
> >> >>
> >> >> Hi,
> >> >>
> >> >> I was trying to compile scalarProd.cu (from CUDA SDK) with the
> >> >> following command:
> >> >>
> >> >>  clang++ -I../ -I/usr/local/cuda-7.0/samples/common/inc --cuda-gpu-arch=sm_50 scalarProd.cu
> >> >>
> >> >> but ended up with the following error:
> >> >>
> >> >> ptxas fatal   : Unresolved extern function '__nv_mul24'
> >> >>
> >> >> Seems to me that libdevice was not automatically linked.  I wonder
> >> >> what flags I need to pass to clang to have the code linked against
> >> >> libdevice?
> >> >>
> >> >> Thanks!
> >> >> Yuanfeng Peng
> >> >> _______________________________________________
> >> >> LLVM Developers mailing list
> >> >> llvm-dev at lists.llvm.org
> >> >> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> >
> >

-- 
--Artem Belevich