[PATCH] D128090: [Clang][OpenMP] Process multi-arch compilation options given via -march

Wed Jul 13 11:36:26 PDT 2022

jhuber6 added a comment.

In D128090#3649059 <https://reviews.llvm.org/D128090#3649059>, @tra wrote:

> In D128090#3648999 <https://reviews.llvm.org/D128090#3648999>, @jhuber6 wrote:
>
>> Right now there's `CLANG_OPENMP_NVPTX_DEFAULT_ARCH`, which is defined by CMake to be the architecture of the system used to build clang
>
> That does not make sense to me. Most of the time clang would be built on a machine without a GPU, so I don't understand how one would derive a sensible value for CLANG_OPENMP_NVPTX_DEFAULT_ARCH there.
> The vast majority of users will use official release builds of clang, and those have no conceivable way to give a sensible default for any specific user. Any guess would be OK sometimes, but it would be wrong most of the time.

IIRC it just falls back to `sm_35` if CUDA isn't present on the build system. Alternatively, we could ship a tool that detects the architecture at compile time.
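
The fallback order described here can be sketched roughly as below. This is only an illustration under stated assumptions, not clang's actual driver code: `pickOffloadArch` and both of its parameters are hypothetical names, and an empty string stands in for "not provided". Only the `sm_35` fallback comes from the discussion itself.

```cpp
#include <string>

// Hypothetical helper (not part of clang): choose the NVPTX offload
// architecture. An empty string means "no value available".
std::string pickOffloadArch(const std::string &userArch,
                            const std::string &detectedArch) {
  if (!userArch.empty())
    return userArch;      // an explicitly requested architecture always wins
  if (!detectedArch.empty())
    return detectedArch;  // value found by CMake or by a detection tool
  return "sm_35";         // fallback when nothing could be detected
}
```

With this ordering, a release build made on a machine without a GPU would always end up on the last branch unless the user passes an architecture explicitly.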

> I'm all for providing a sensible default, but there's no such thing when it comes to GPUs. CUDA falls back on the oldest supported GPU architecture, which has the only benefit of working for occasional manual tinkering. It is consistently wrong for just about everyone and forces them to specify the actual offload targets relevant to their use case.
> So far it's the least bad and somewhat consistent approach I've seen.

Sometimes people get tricked into thinking the default works because the driver JIT-compiles the PTX output for whatever GPU is actually present. There's an argument to be made that we shouldn't support any defaults at all, since architectures like AMDGPU provide no such compatibility between targets.
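
To make the JIT point concrete, here is a simplified model of why an old default can appear to work on a newer NVIDIA GPU. It assumes the PTX is embedded in the binary and that the driver can JIT it for any device whose compute capability is at least the one the PTX was built for. `smVersion` and `ptxCanJitFor` are hypothetical names, and the parser only understands plain `sm_NN` strings, so anything else (including AMDGPU names like `gfx906`) is reported as incompatible.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: parse "sm_70" into 70; returns -1 for anything
// that is not "sm_" followed only by digits.
int smVersion(const std::string &arch) {
  if (arch.size() <= 3 || arch.compare(0, 3, "sm_") != 0)
    return -1;
  int version = 0;
  for (std::size_t i = 3; i < arch.size(); ++i) {
    if (arch[i] < '0' || arch[i] > '9')
      return -1;
    version = version * 10 + (arch[i] - '0');
  }
  return version;
}

// Simplified model: PTX built for one compute capability can be JIT-compiled
// for a device with the same or a newer compute capability.
bool ptxCanJitFor(const std::string &builtFor, const std::string &device) {
  int built = smVersion(builtFor);
  int dev = smVersion(device);
  return built >= 0 && dev >= 0 && dev >= built;
}
```

Under this model, `ptxCanJitFor("sm_35", "sm_70")` holds, which is how a wrong default can go unnoticed, while the reverse direction does not.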

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D128090/new/

https://reviews.llvm.org/D128090