[clang] [CUDA] Add device-side kernel launch support (PR #165519)

Fri Nov 21 14:11:27 PST 2025

================
@@ -525,6 +564,11 @@ Expected<SmallVector<StringRef>> getInput(const ArgList &Args) {
           object::Archive::create(Buffer);
       if (!LibFile)
         return LibFile.takeError();
+      // Skip extracting archives with fat binaries. Forward them to nvlink.
+      if (hasFatBinary(**LibFile)) {
+        ForwardArchives.emplace_back(Args.MakeArgString(*Filename));
----------------
darkbuck wrote:

> Hm, it's probably not worth mixing handling for these CUDA fat binary archives. The tough part is that we kind of just want to forward these without modification while the rest of the handling is supposed to do the normal linker stuff that nvlink doesn't do. I'd assume all we'd need is something like this. Do we need additional handling?
> 
> ```c
> // Just let nvlink handle these directly.
> if (hasFatBinary(Archive))
>   Files.emplace_back(Archive);
> ```

@jhuber6 if we do it this way, all fatbin archives end up ahead of the other object files. For example, `obj.o -lcudadevrt` is translated into
`<path>/cudadevrt.a obj.o`, and `nvlink` reports an undefined symbol with that order. How about we append all fatbin archives at the end instead?
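
To make the proposed ordering concrete, here is a minimal standalone sketch. It is not the code in this patch: `LinkInput` and `orderForNvlink` are hypothetical names, and the `IsFatBinaryArchive` flag stands in for whatever `hasFatBinary` reports. Regular inputs keep their original order, and the forwarded fatbin archives are appended after them.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for one linker input; not the wrapper's real type.
struct LinkInput {
  std::string Path;
  bool IsFatBinaryArchive; // assumed to come from a check like hasFatBinary()
};

// Build the input list for nvlink: regular inputs first, in their original
// order, followed by the forwarded fatbin archives in their original
// relative order.
std::vector<std::string> orderForNvlink(const std::vector<LinkInput> &Inputs) {
  std::vector<std::string> Files;
  std::vector<std::string> ForwardArchives;
  for (const LinkInput &In : Inputs)
    (In.IsFatBinaryArchive ? ForwardArchives : Files).push_back(In.Path);
  // Append the forwarded archives last so that objects referencing their
  // symbols appear earlier on the command line.
  Files.insert(Files.end(), ForwardArchives.begin(), ForwardArchives.end());
  return Files;
}
```

With this ordering, an input list that reaches the wrapper as `<path>/cudadevrt.a obj.o` would be passed on as `obj.o <path>/cudadevrt.a`.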

https://github.com/llvm/llvm-project/pull/165519