[clang] [clang-linker-wrapper] Pass on --cuda-path to child clang processes (PR #149107)

Thu Jul 17 01:25:30 PDT 2025

ivanradanov wrote:

I was passing `-fopenmp --offload-arch=sm_80` above, so 
```
clang -fopenmp --offload-arch=sm_80 --verbose -foffload-via-llvm --cuda-path=/usr/local/cuda input.o  -o a.out
```

gives us the appropriate flags. That means the CUDA toolchain was created, correct?

I wonder if we need a step in clang that scans all the .o files for sections that need device linking, collects their archs, and reinvokes itself with --offload-arch=<all_collected_arches>. It is a bit weird to have clang do this, since parsing the .o files for that is clang-linker-wrapper's job, but then in theory the appropriate toolchains would be created. Perhaps it would only kick in when -foffload-via-llvm is on but no --offload-arch is specified, i.e. when we are asking clang to figure out the appropriate offload archs.

That step could actually be handled by clang-linker-wrapper; you would get 

```
clang -foffload-via-llvm <args>
  -> clang-linker-wrapper --detect-archs-and-exec=clang <args>
    -> clang -foffload-via-llvm --offload-arch=<detected_archs> <args>
      -> clang-linker-wrapper (same as until now)
```

Pretty convoluted, so I don't know if it's appropriate.
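
To make the collect-and-reinvoke idea concrete, here is a rough Python sketch of just the command-line rewriting part. It is not how clang or clang-linker-wrapper is implemented; the helper names (`collect_offload_archs`, `build_reinvocation`) and the `detected` mapping are hypothetical, and the actual scanning of the .o files is assumed to have already happened. It emits one `--offload-arch=` flag per detected arch and leaves the command line alone if the user already specified one.

```python
# Hypothetical sketch: none of these helpers exist in clang or clang-linker-wrapper.
from typing import Dict, List


def collect_offload_archs(archs_by_object: Dict[str, List[str]]) -> List[str]:
    """Return the union of archs found across all objects, in first-seen order."""
    seen: List[str] = []
    for archs in archs_by_object.values():
        for arch in archs:
            if arch not in seen:
                seen.append(arch)
    return seen


def build_reinvocation(args: List[str], archs: List[str]) -> List[str]:
    """Build the second clang command line with explicit --offload-arch flags."""
    # If the user already named an arch, do not override their choice.
    if any(a.startswith("--offload-arch=") for a in args):
        return ["clang", *args]
    return ["clang", *[f"--offload-arch={a}" for a in archs], *args]


# Assumed result of scanning the inputs; the scan itself is not shown.
detected = {"a.o": ["sm_80"], "b.o": ["sm_80", "sm_90"]}
cmd = build_reinvocation(
    ["-fopenmp", "-foffload-via-llvm", "--cuda-path=/usr/local/cuda",
     "a.o", "b.o", "-o", "a.out"],
    collect_offload_archs(detected),
)
print(" ".join(cmd))
```

Since the user's original arguments (including `--cuda-path`) are carried through unchanged, the second clang invocation would create the toolchains the usual way.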

https://github.com/llvm/llvm-project/pull/149107