[flang] [clang] [clang-tools-extra] [llvm] [compiler-rt] [libcxx] [libc] [lldb] [lld] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)

Thu Jan 25 13:01:28 PST 2024

jhuber6 wrote:

> > This method of compilation is not like CUDA, so we can't target all the GPUs at the same time.
> 
> I think this is the key fact I was missing. If the patch is only for a standalone compilation which does not do multi-GPU compilation in principle, then your approach makes sense.
> 
> > I was arguing from the normal offloading case, which does have the ability to target multiple GPUs.

Yes, this is more similar to OpenCL or regular CPU compilation, where a single job creates a simple executable, terminal-application style. So given a single target, the desire is "pick the one that will work on the default CUDA device without me needing to check."

https://github.com/llvm/llvm-project/pull/79373