[libcxx-commits] [clang] [lld] [libcxx] [flang] [compiler-rt] [libc] [clang-tools-extra] [llvm] [lldb] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)

Thu Jan 25 11:48:13 PST 2024

jhuber6 wrote:

> User confusion is only part of the issue here. With any single GPU choice we would still potentially produce a nonworking binary, if our GPU choice does not match what the user wants.
>
> "all GPUs" has the advantage of always producing the binary that's guaranteed to work. Granted, in the case of multiple GPUs it comes with the compilation time overhead, but I think it's a better trade-off than compiling faster, but not working. If the overhead is unacceptable, then we can tweak the build, but in that case, the user may as well just specify the desired architectures explicitly.

I think the semantics of `native` on other architectures are clear enough here. Combined with the fact that `-march=native` will error out if no GPUs are available, or give a warning if more than one GPU is available, it should be sufficiently clear what it's doing. This obviously falls apart if you compile with `-march=native` and then move the binary off the system you compiled it for, but I feel the same applies to standard x64 binaries.
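As a rough sketch of the behaviour described above (not the actual driver code: the function name, the error and warning text, and the choice to fall back to the first detected GPU are all assumptions made for illustration):

```python
import warnings


def resolve_native_arch(detected):
    """Pick one architecture for `-march=native` from the detected GPUs.

    `detected` is a list of architecture strings such as ["sm_70"], in the
    order a detection tool reported them. Hypothetical helper, for
    illustration only.
    """
    if not detected:
        # No GPU on the build machine: there is nothing "native" to target,
        # so this is a hard error rather than a silent default.
        raise RuntimeError("cannot resolve -march=native: no GPU detected")
    if len(detected) > 1:
        # More than one GPU: warn, then pick one.
        # Assumption: the first GPU reported is the one that gets used.
        warnings.warn(
            "multiple GPUs detected (" + ", ".join(detected) + "); "
            "using " + detected[0]
        )
    return detected[0]
```

With a single GPU this just returns its architecture; the empty and multi-GPU cases are the ones that surface a diagnostic to the user.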

Realistically, very, very few casual users are going to be using direct NVPTX targeting. The current use-case is building tests directly for the GPU without needing to call `amdgpu-arch` and `nvptx-arch` manually in CMake. If I had this in, I could simplify a lot of CMake code in my `libc` project by just letting the compiler handle the autodetection. That would remove one more random program dependency from the build process. AMDGPU already has `-mcpu=native`, so I'd like NVPTX to match if possible.

https://github.com/llvm/llvm-project/pull/79373