[PATCH] D140158: [CUDA] Allow targeting NVPTX directly without a host toolchain

Joseph Huber via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Thu Dec 15 15:34:58 PST 2022


jhuber6 added a comment.

In D140158#3999783 <https://reviews.llvm.org/D140158#3999783>, @tra wrote:

> In D140158#3999716 <https://reviews.llvm.org/D140158#3999716>, @jhuber6 wrote:
>
>> I just realized the method of copying the `.o` to a `.cubin` doesn't work if the link step is done in the same compilation, because the `.o` doesn't exist yet at that point. To fix this I could either make the tool chain emit `.cubin` if we're going straight to `nvlink`, or use a symbolic link. The former is uglier, the latter will probably only work on Linux.
>
> Does it have to be `.cubin`? Does nvlink require it?

Yes, it's one of the dumbest things in the CUDA toolchain. If the filename ends in `.o`, `nvlink` assumes it's a host file containing embedded RDC code that needs to be combined. If it ends in `.cubin`, it assumes it's device code. I have no clue why it couldn't just check the ELF flags; that would be trivial.

>> Also do you think I should include the CUDA headers in with this? We can always get rid of them with `nogpuinc` or similar if they're not needed. The AMDGPU compilation still links in the device libraries so I'm wondering if we should at least be consistent.
>
> Only some CUDA headers can be used from C++, and those tend to contain host-only declarations plus stubs or CPU implementations of GPU-side functions.
> The rest rely on CUDA extensions and attributes. So in this case a C++ compilation will give you a host-side view of the headers at best, which is not going to be particularly useful.
>
> If we want to make GPU-side functions available, we'll need a pre-included wrapper with more preprocessor magic to make the CUDA headers usable. Doing that in a C++ compilation would be questionable, since a user would normally assume that the compiler pre-includes no headers in a C++ compilation.

We might want to at least include the `libdevice` files; most of our wrappers definitely won't work without the CUDA or OpenMP language modes.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D140158/new/

https://reviews.llvm.org/D140158


