[PATCH] D140158: [CUDA] Allow targeting NVPTX directly without a host toolchain

Thu Dec 15 15:08:50 PST 2022

jhuber6 marked an inline comment as done.
jhuber6 added a comment.

I just realized that copying the `.o` to a `.cubin` doesn't work when the link step runs in the same compilation, because the `.o` doesn't exist yet. To fix this I could either make the toolchain emit a `.cubin` directly when we're going straight to `nvlink`, or use a symbolic link. The former is uglier; the latter will probably only work on Linux.
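
A minimal sketch of the ordering problem, not the code in this patch: `aliasObjectAsCubin` is a hypothetical helper written against `std::filesystem`, and the file names are assumptions. It shows that a copy needs the source to exist already, while a symbolic link can be created first and only resolves once the object has been written.

```cpp
#include <filesystem>
#include <fstream>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper (illustration only): make `cubin` refer to `obj`.
// Returns true if either a copy or a symbolic link was created.
bool aliasObjectAsCubin(const fs::path &obj, const fs::path &cubin) {
  std::error_code ec;

  // Drop any stale file or link; a missing path is not an error here.
  fs::remove(cubin, ec);

  // A copy only works if the object has already been written, which is
  // not the case when the link step is part of the same compilation.
  if (fs::exists(obj, ec) && fs::copy_file(obj, cubin, ec))
    return true;

  // A symbolic link may point at a file that does not exist yet.
  // Creating it can fail on platforms that restrict symbolic links.
  ec.clear();
  fs::path target = fs::absolute(obj, ec);
  if (ec)
    return false;
  fs::create_symlink(target, cubin, ec);
  return !ec;
}
```

Emitting the `.cubin` name directly from the toolchain would avoid both branches, at the cost of special-casing the output extension.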

Also, do you think I should include the CUDA headers with this? We can always drop them with `-nogpuinc` or similar if they're not needed. The AMDGPU compilation still links in the device libraries, so I'm wondering if we should at least be consistent.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D140158/new/

https://reviews.llvm.org/D140158