[clang] [Clang] Permit `-Xarch_` to be used with `--offload-arch` (PR #131884)

Tue Mar 18 11:58:13 PDT 2025

https://github.com/jhuber6 created https://github.com/llvm/llvm-project/pull/131884

Summary:
The `--offload-arch` option is very complicated, but roughly behaves as
the `-march` option for several compilations at once. This creates
problems when we try to combine multiple separate architectures into
one, as happens with SYCL, OpenMP, and HIP w/ SPIR-V.

The existing solution used by OpenMP is the `-Xopenmp-target` option,
which lets you select which `--offload-arch` options go to which
toolchain. This patch permits `-Xarch_` to be used in the same way.

There are concerns about whether or not this falls into the `-Xarch_`
umbrella because it changes the driver behavior, but I think this is the
easiest way to handle this problem. The existing solutions seem to be
prefixing things and adding more magic handling into `--offload-arch`.
For example, SPIR-V uses `nvidia_gpu_sm_89` instead of just
`-Xarch_nvptx64 --offload-arch=sm_89`.

The only reason this is more complicated than just doing `-Xarch_sm_89
-march=...` is that we need to know to create multiple jobs, one for
each architecture.
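
To make the intended routing concrete, here is a toy model in Python of how `-Xarch_<name>` could bind the following `--offload-arch` argument to a matching toolchain. It is only a sketch, not the driver's actual implementation: the function name `route_offload_archs`, the matching on the first component of the triple, and the handling of unprefixed values are all assumptions made for illustration. The arguments mirror the new RUN line in the test.

```python
from collections import defaultdict


def route_offload_archs(args, toolchains):
    """Toy model (hypothetical helper, not Clang code).

    `-Xarch_<name> <arg>` binds only the next argument to toolchains whose
    triple starts with <name>. Unprefixed --offload-arch values are collected
    under the key None, since the driver's usual inference is not modelled.
    """
    routed = defaultdict(list)
    i = 0
    while i < len(args):
        arg = args[i]
        selector = None
        if arg.startswith("-Xarch_") and i + 1 < len(args):
            selector = arg[len("-Xarch_"):]
            i += 1
            arg = args[i]
        if arg.startswith("--offload-arch="):
            archs = arg[len("--offload-arch="):].split(",")
            if selector is None:
                routed[None].extend(archs)
            else:
                for triple in toolchains:
                    # Assumption: match on the architecture part of the triple.
                    if triple.split("-")[0] == selector:
                        routed[triple].extend(archs)
        i += 1
    return dict(routed)


args = [
    "-fopenmp=libomp",
    "-fopenmp-targets=nvptx64-nvidia-cuda,amdgcn-amd-amdhsa",
    "-Xarch_nvptx64", "--offload-arch=sm_52,sm_60",
    "-Xarch_amdgcn", "--offload-arch=gfx90a,gfx1030",
]
toolchains = ["nvptx64-nvidia-cuda", "amdgcn-amd-amdhsa"]

jobs = route_offload_archs(args, toolchains)
for triple, archs in jobs.items():
    for arch in archs:
        # One device compilation per architecture, as the summary describes.
        print(f"{triple}: device job for {arch}")
```

The point of the sketch is that each toolchain ends up with its own list of architectures, and each entry in that list corresponds to a separate device job.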


From 2991e0038881b144f3874855ee007534c9c7c313 Mon Sep 17 00:00:00 2001
From: Joseph Huber <huberjn at outlook.com>
Date: Tue, 18 Mar 2025 13:49:29 -0500
Subject: [PATCH] [Clang] Permit `-Xarch_` to be used with `--offload-arch`

Summary:
The `--offload-arch` option is very complicated, but roughly behaves as
the `-march` option for several compilations at once. This creates
problems when we try to combine multiple separate architectures into
one, as happens with SYCL, OpenMP, and HIP w/ SPIR-V.

The existing solution used by OpenMP is the `-Xopenmp-target` option,
which lets you select which `--offload-arch` options go to which
toolchain. This patch permits `-Xarch_` to be used in the same way.

There are concerns about whether or not this falls into the `-Xarch_`
umbrella because it changes the driver behavior, but I think this is the
easiest way to handle this problem. The existing solutions seem to be
prefixing things and adding more magic handling into `--offload-arch`.
For example, SPIR-V uses `nvidia_gpu_sm_89` instead of just
`-Xarch_nvptx64 --offload-arch=sm_89`.

The only reason this is more complicated than just doing `-Xarch_sm_89
-march=...` is that we need to know to create multiple jobs, one for
each architecture.
---
 clang/include/clang/Driver/Options.td | 3 +--
 clang/test/Driver/offload-Xarch.c     | 4 ++++
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index 66ae8f1c7f064..05fc6aaa266b5 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -1129,13 +1129,12 @@ def fno_convergent_functions : Flag<["-"], "fno-convergent-functions">,
 // Common offloading options
 let Group = offload_Group in {
 def offload_arch_EQ : Joined<["--"], "offload-arch=">,
-  Visibility<[ClangOption, FlangOption]>, Flags<[NoXarchOption]>,
+  Visibility<[ClangOption, FlangOption]>,
   HelpText<"Specify an offloading device architecture for CUDA, HIP, or OpenMP. (e.g. sm_35). "
            "If 'native' is used the compiler will detect locally installed architectures. "
            "For HIP offloading, the device architecture can be followed by target ID features "
            "delimited by a colon (e.g. gfx908:xnack+:sramecc-). May be specified more than once.">;
 def no_offload_arch_EQ : Joined<["--"], "no-offload-arch=">,
-  Flags<[NoXarchOption]>,
   Visibility<[ClangOption, FlangOption]>,
   HelpText<"Remove CUDA/HIP offloading device architecture (e.g. sm_35, gfx906) from the list of devices to compile for. "
            "'all' resets the list to its default value.">;
diff --git a/clang/test/Driver/offload-Xarch.c b/clang/test/Driver/offload-Xarch.c
index 8856dac198465..8106dcfcd1354 100644
--- a/clang/test/Driver/offload-Xarch.c
+++ b/clang/test/Driver/offload-Xarch.c
@@ -14,6 +14,10 @@
 // RUN:   --target=x86_64-unknown-linux-gnu -Xopenmp-target=nvptx64-nvidia-cuda --offload-arch=sm_52,sm_60 -nogpuinc \
 // RUN:   -Xopenmp-target=amdgcn-amd-amdhsa --offload-arch=gfx90a,gfx1030 -ccc-print-bindings -### %s 2>&1 \
 // RUN: | FileCheck -check-prefix=OPENMP %s
+// RUN: %clang -fopenmp=libomp -fopenmp-targets=nvptx64-nvidia-cuda,amdgcn-amd-amdhsa -nogpulib \
+// RUN:   --target=x86_64-unknown-linux-gnu -Xarch_nvptx64 --offload-arch=sm_52,sm_60 -nogpuinc \
+// RUN:   -Xarch_amdgcn --offload-arch=gfx90a,gfx1030 -ccc-print-bindings -### %s 2>&1 \
+// RUN: | FileCheck -check-prefix=OPENMP %s
 
 // OPENMP: # "x86_64-unknown-linux-gnu" - "clang", inputs: ["[[INPUT:.+]]"], output: "[[HOST_BC:.+]]"
 // OPENMP: # "amdgcn-amd-amdhsa" - "clang", inputs: ["[[INPUT]]", "[[HOST_BC]]"], output: "[[GFX1030_BC:.+]]"