[PATCH] D69990: Populate CUDA flags on FreeBSD too, as many other toolchains do.
Dimitry Andric via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Mon Nov 18 12:57:49 PST 2019
dim added a comment.
In D69990#1750348 <https://reviews.llvm.org/D69990#1750348>, @tra wrote:
> LGTM, though I'm curious if it's particularly useful. Last time I checked NVIDIA didn't ship libcudart for FreeBSD and without it it's rather cumbersome to use CUDA in practice.
After extracting the necessary CUDA files and enabling Linux emulation (needed for `ptxas`), at least a "hello world" sample program compiles to an object file:
$ ~/obj/llvm/llvmorg-10-init-10100-g9a5b7b785bf-freebsd13-amd64-ninja-rel-1/bin/clang --cuda-path=/share/dim/src/freebsd/cuda/cuda-10.1 --cuda-gpu-arch=sm_60 -c hello.cu -v
clang version 10.0.0 (https://github.com/llvm/llvm-project.git 014799db369c8e30c222c0e9d3ea143f349c3db9)
Target: x86_64-unknown-freebsd13.0
Thread model: posix
InstalledDir: /home/dim/obj/llvm/llvmorg-10-init-10100-g9a5b7b785bf-freebsd13-amd64-ninja-rel-1/bin
Found CUDA installation: /share/dim/src/freebsd/cuda/cuda-10.1, version 10.1
"/home/dim/obj/llvm/llvmorg-10-init-10100-g9a5b7b785bf-freebsd13-amd64-ninja-rel-1/bin/clang" -cc1 -triple nvptx64-nvidia-cuda -aux-triple x86_64-unknown-freebsd13.0 -S -disable-free -main-file-name hello.cu -mrelocation-model static -mthread-model posix -mframe-pointer=all -fno-rounding-math -no-integrated-as -fuse-init-array -fcuda-is-device -mlink-builtin-bitcode /share/dim/src/freebsd/cuda/cuda-10.1/nvvm/libdevice/libdevice.10.bc -target-feature +ptx64 -target-sdk-version=10.1 -target-cpu sm_60 -dwarf-column-info -debugger-tuning=gdb -v -resource-dir /home/dim/obj/llvm/llvmorg-10-init-10100-g9a5b7b785bf-freebsd13-amd64-ninja-rel-1/lib/clang/10.0.0 -internal-isystem /home/dim/obj/llvm/llvmorg-10-init-10100-g9a5b7b785bf-freebsd13-amd64-ninja-rel-1/lib/clang/10.0.0/include/cuda_wrappers -internal-isystem /share/dim/src/freebsd/cuda/cuda-10.1/include -include __clang_cuda_runtime_wrapper.h -internal-isystem /usr/include/c++/v1 -internal-isystem /usr/include/c++/v1 -fdeprecated-macro -fno-dwarf-directory-asm -fno-autolink -fdebug-compilation-dir /tmp -ferror-limit 19 -fmessage-length 160 -fgnuc-version=4.2.1 -fobjc-runtime=gcc -fcxx-exceptions -fexceptions -fdiagnostics-show-option -fcolor-diagnostics -o /home/dim/tmp/hello-f032c8.s -x cuda hello.cu
clang -cc1 version 10.0.0 based upon LLVM 10.0.0git default target x86_64-unknown-freebsd13.0
ignoring duplicate directory "/usr/include/c++/v1"
#include "..." search starts here:
#include <...> search starts here:
/home/dim/obj/llvm/llvmorg-10-init-10100-g9a5b7b785bf-freebsd13-amd64-ninja-rel-1/lib/clang/10.0.0/include/cuda_wrappers
/share/dim/src/freebsd/cuda/cuda-10.1/include
/usr/include/c++/v1
/home/dim/obj/llvm/llvmorg-10-init-10100-g9a5b7b785bf-freebsd13-amd64-ninja-rel-1/lib/clang/10.0.0/include
/usr/include
End of search list.
"/share/dim/src/freebsd/cuda/cuda-10.1/bin/ptxas" -m64 -O0 -v --gpu-name sm_60 --output-file /home/dim/tmp/hello-54422a.o /home/dim/tmp/hello-f032c8.s
ptxas info : 23 bytes gmem
ptxas info : Compiling entry function '_Z10cuda_hellov' for 'sm_60'
ptxas info : Function properties for _Z10cuda_hellov
0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info : Used 8 registers, 320 bytes cmem[0]
"/share/dim/src/freebsd/cuda/cuda-10.1/bin/fatbinary" -64 --create /home/dim/tmp/hello-9cd109.fatbin --image=profile=sm_60,file=/home/dim/tmp/hello-54422a.o --image=profile=compute_60,file=/home/dim/tmp/hello-f032c8.s
"/home/dim/obj/llvm/llvmorg-10-init-10100-g9a5b7b785bf-freebsd13-amd64-ninja-rel-1/bin/clang" -cc1 -triple x86_64-unknown-freebsd13.0 -target-sdk-version=10.1 -aux-triple nvptx64-nvidia-cuda -emit-obj -mrelax-all -disable-free -main-file-name hello.cu -mrelocation-model static -mthread-model posix -mframe-pointer=all -fno-rounding-math -masm-verbose -mconstructor-aliases -munwind-tables -fuse-init-array -target-cpu x86-64 -dwarf-column-info -debugger-tuning=gdb -v -resource-dir /home/dim/obj/llvm/llvmorg-10-init-10100-g9a5b7b785bf-freebsd13-amd64-ninja-rel-1/lib/clang/10.0.0 -internal-isystem /home/dim/obj/llvm/llvmorg-10-init-10100-g9a5b7b785bf-freebsd13-amd64-ninja-rel-1/lib/clang/10.0.0/include/cuda_wrappers -internal-isystem /share/dim/src/freebsd/cuda/cuda-10.1/include -include __clang_cuda_runtime_wrapper.h -internal-isystem /usr/include/c++/v1 -internal-isystem /usr/include/c++/v1 -fdeprecated-macro -fdebug-compilation-dir /tmp -ferror-limit 19 -fmessage-length 160 -fgnuc-version=4.2.1 -fobjc-runtime=gnustep -fcxx-exceptions -fexceptions -fdiagnostics-show-option -fcolor-diagnostics -fcuda-include-gpubinary /home/dim/tmp/hello-9cd109.fatbin -faddrsig -o hello.o -x cuda hello.cu
clang -cc1 version 10.0.0 based upon LLVM 10.0.0git default target x86_64-unknown-freebsd13.0
ignoring duplicate directory "/usr/include/c++/v1"
#include "..." search starts here:
#include <...> search starts here:
/home/dim/obj/llvm/llvmorg-10-init-10100-g9a5b7b785bf-freebsd13-amd64-ninja-rel-1/lib/clang/10.0.0/include/cuda_wrappers
/share/dim/src/freebsd/cuda/cuda-10.1/include
/usr/include/c++/v1
/home/dim/obj/llvm/llvmorg-10-init-10100-g9a5b7b785bf-freebsd13-amd64-ninja-rel-1/lib/clang/10.0.0/include
/usr/include
End of search list.
I can't link it into an executable yet, though. That will probably need some additional link flags.
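The source of hello.cu is not included in this message. A minimal sketch consistent with the ptxas output (an entry function `cuda_hello()`, mangled as `_Z10cuda_hellov`, plus 23 bytes of global memory, which would fit a short format string) might look like the following. The message text and the `main` function are assumptions, not the actual file:

```cuda
// hello.cu: hypothetical reconstruction; only the kernel name is taken
// from the ptxas output (_Z10cuda_hellov demangles to cuda_hello()).
#include <cstdio>

// Device entry point. The string literal is an assumption; it occupies
// 23 bytes including the terminating NUL.
__global__ void cuda_hello() {
    printf("Hello World from GPU!\n");
}

int main() {
    // Launch one block with one thread.
    cuda_hello<<<1, 1>>>();
    // Wait for the kernel so its printf output is flushed before exit.
    cudaDeviceSynchronize();
    return 0;
}
```

With `-c`, only the object file is produced, so the kernel launch and `cudaDeviceSynchronize()` call in `main` are compiled but never linked against a runtime library.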
> You can compile a kernel, but kernel loading, launching, and related data transfers will all need to be done via driver API. It should be possible to implement a functional replacement, but I'm not aware of any existing open-source implementations. I'm also not sure if clang will be able to deal with CUDA headers correctly on FreeBSD as CUDA headers do sometimes seem to rely on implementation specifics of Linux headers.
I think @6yearold is at least experimenting with this. One step at a time... :)
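To illustrate what "kernel loading and launching via the driver API" involves, here is a rough host-side sketch. It assumes a system where `cuda.h` and a working driver library are available, and that the device code has been saved to a file named `hello.fatbin` (a hypothetical name; the log above only shows a temporary fatbin). It has not been tried on FreeBSD:

```cuda
// Hypothetical loader using the CUDA driver API instead of libcudart.
#include <cuda.h>
#include <cstdio>
#include <cstdlib>

// Abort with a message if a driver API call fails.
#define CHECK(call)                                                   \
    do {                                                              \
        CUresult res_ = (call);                                       \
        if (res_ != CUDA_SUCCESS) {                                   \
            const char *msg_ = nullptr;                               \
            cuGetErrorString(res_, &msg_);                            \
            fprintf(stderr, "%s failed: %s\n", #call,                 \
                    msg_ ? msg_ : "unknown error");                   \
            exit(1);                                                  \
        }                                                             \
    } while (0)

int main() {
    CUdevice dev;
    CUcontext ctx;
    CUmodule mod;
    CUfunction fn;

    CHECK(cuInit(0));                    // initialize the driver API
    CHECK(cuDeviceGet(&dev, 0));         // first GPU
    CHECK(cuCtxCreate(&ctx, 0, dev));    // context on that GPU

    // Load the device image and look up the kernel by its mangled name.
    CHECK(cuModuleLoad(&mod, "hello.fatbin"));
    CHECK(cuModuleGetFunction(&fn, mod, "_Z10cuda_hellov"));

    // One block of one thread, no shared memory, default stream,
    // no kernel arguments.
    CHECK(cuLaunchKernel(fn, 1, 1, 1, 1, 1, 1, 0, nullptr, nullptr, nullptr));
    CHECK(cuCtxSynchronize());

    CHECK(cuModuleUnload(mod));
    CHECK(cuCtxDestroy(ctx));
    return 0;
}
```

A kernel that takes arguments would also need device allocations and copies (for example `cuMemAlloc` and `cuMemcpyHtoD`), which is the "related data transfers" part of the comment quoted above.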
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D69990/new/
https://reviews.llvm.org/D69990