[llvm] 52e10e6 - [SYCL] Add clang-linker-wrapper changes to call clang-sycl-linker for SYCL offloads (#135683)

via llvm-commits llvm-commits at lists.llvm.org
Thu Apr 17 09:48:12 PDT 2025


Author: Arvind Sudarsanam
Date: 2025-04-17T16:48:08Z
New Revision: 52e10e6c3bad782380c8a931aabca2800b53a85d

URL: https://github.com/llvm/llvm-project/commit/52e10e6c3bad782380c8a931aabca2800b53a85d
DIFF: https://github.com/llvm/llvm-project/commit/52e10e6c3bad782380c8a931aabca2800b53a85d.diff

LOG: [SYCL] Add clang-linker-wrapper changes to call clang-sycl-linker for SYCL offloads (#135683)

This PR is one of the many PRs in the SYCL upstreaming effort focusing
on device code linking during the SYCL offload compilation process. RFC:
https://discourse.llvm.org/t/rfc-offloading-design-for-sycl-offload-kind-and-spir-targets/74088

Approved PRs so far:
1. [Clang][SYCL] Introduce clang-sycl-linker to link SYCL offloading
device code (Part 1 of many) -
[Link](https://github.com/llvm/llvm-project/pull/112245)
2. [clang-sycl-linker] Replace llvm-link with API calls -
[Link](https://github.com/llvm/llvm-project/pull/133797)
3. [SYCL][SPIR-V Backend][clang-sycl-linker] Add SPIR-V backend support
inside clang-sycl-linker -
[Link](https://github.com/llvm/llvm-project/pull/133967)

This PR adds SYCL device code linking support to clang-linker-wrapper.

### Summary for this PR

Device code linking happens inside clang-linker-wrapper. In the current
implementation, clang-linker-wrapper does the following:

1. Extracts device code. Input_1, Input_2,.....
5. Group device code according to target devices Inputs[triple_1] = ....
Inputs[triple_2] = ....
6. For each group, i.e. Inputs[triple_i], a. Gather all the offload
kinds found inside those inputs in ActiveOffloadKinds b. Link all images
inside Inputs[triple_i] by calling clang --target=triple_i .... c.
Create a copy of that linked image for each offload kind and add it to
Output[Kind] list.

In SYCL compilation flow, there is a deviation in Step 3b. We call
device code splitting inside the 'clang --target=triple_i ....' call and
the output is now a 'packaged' file containing multiple device images.
This deviation requires us to capture the OffloadKind during the linking
stage and pass it along to the linking function (clang), so that clang
can be called with a unique option '--sycl-link' that will help us to
call 'clang-sycl-linker' under the hood (clang-sycl-linker will do SYCL
specific linking).

Our current objective is to implement an end-to-end SYCL offloading flow
and get it working. We will eventually merge our approach with the
community flow.

Thanks

---------

Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam at intel.com>

Added: 
    

Modified: 
    clang/docs/ClangOffloadPackager.rst
    clang/test/Driver/clang-sycl-linker-test.cpp
    clang/test/Driver/linker-wrapper.c
    clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
    clang/tools/clang-sycl-linker/ClangSYCLLinker.cpp
    llvm/lib/Object/OffloadBinary.cpp

Removed: 
    


################################################################################
diff  --git a/clang/docs/ClangOffloadPackager.rst b/clang/docs/ClangOffloadPackager.rst
index 2b985e260e302..481069b5e4235 100644
--- a/clang/docs/ClangOffloadPackager.rst
+++ b/clang/docs/ClangOffloadPackager.rst
@@ -112,6 +112,8 @@ the following values for the :ref:`offload kind<table-offload_kind>` and the
     +------------+-------+---------------------------------------+
     | OFK_HIP    | 0x03  | The producer was HIP                  |
     +------------+-------+---------------------------------------+
+    | OFK_SYCL   | 0x04  | The producer was SYCL                 |
+    +------------+-------+---------------------------------------+
 
 The flags are used to signify certain conditions, such as the presence of
 debugging information or whether or not LTO was used. The string entry table is

diff  --git a/clang/test/Driver/clang-sycl-linker-test.cpp b/clang/test/Driver/clang-sycl-linker-test.cpp
index c399689653784..1dfba391fbc3e 100644
--- a/clang/test/Driver/clang-sycl-linker-test.cpp
+++ b/clang/test/Driver/clang-sycl-linker-test.cpp
@@ -8,7 +8,7 @@
 // RUN: clang-sycl-linker --dry-run -v -triple=spirv64 %t_1.bc %t_2.bc -o a.spv 2>&1 \
 // RUN:   | FileCheck %s --check-prefix=SIMPLE-FO
 // SIMPLE-FO: sycl-device-link: inputs: {{.*}}.bc, {{.*}}.bc  libfiles:  output: [[LLVMLINKOUT:.*]].bc
-// SIMPLE-FO-NEXT: SPIR-V Backend: input: [[LLVMLINKOUT]].bc, output: a.spv
+// SIMPLE-FO-NEXT: SPIR-V Backend: input: [[LLVMLINKOUT]].bc, output: a_0.spv
 //
 // Test the dry run of a simple case with device library files specified.
 // RUN: touch %T/lib1.bc
@@ -16,7 +16,7 @@
 // RUN: clang-sycl-linker --dry-run -v -triple=spirv64 %t_1.bc %t_2.bc --library-path=%T --device-libs=lib1.bc,lib2.bc -o a.spv 2>&1 \
 // RUN:   | FileCheck %s --check-prefix=DEVLIBS
 // DEVLIBS: sycl-device-link: inputs: {{.*}}.bc  libfiles: {{.*}}lib1.bc, {{.*}}lib2.bc  output: [[LLVMLINKOUT:.*]].bc
-// DEVLIBS-NEXT: SPIR-V Backend: input: [[LLVMLINKOUT]].bc, output: a.spv
+// DEVLIBS-NEXT: SPIR-V Backend: input: [[LLVMLINKOUT]].bc, output: a_0.spv
 //
 // Test a simple case with a random file (not bitcode) as input.
 // RUN: touch %t.o

diff  --git a/clang/test/Driver/linker-wrapper.c b/clang/test/Driver/linker-wrapper.c
index 0c77c2b34216a..a7e98e7351d98 100644
--- a/clang/test/Driver/linker-wrapper.c
+++ b/clang/test/Driver/linker-wrapper.c
@@ -2,6 +2,7 @@
 // REQUIRES: x86-registered-target
 // REQUIRES: nvptx-registered-target
 // REQUIRES: amdgpu-registered-target
+// REQUIRES: spirv-registered-target
 
 // An externally visible variable so static libraries extract.
 __attribute__((visibility("protected"), used)) int x;
@@ -9,6 +10,7 @@ __attribute__((visibility("protected"), used)) int x;
 // RUN: %clang -cc1 %s -triple x86_64-unknown-linux-gnu -emit-obj -o %t.elf.o
 // RUN: %clang -cc1 %s -triple nvptx64-nvidia-cuda -emit-llvm-bc -o %t.nvptx.bc
 // RUN: %clang -cc1 %s -triple amdgcn-amd-amdhsa -emit-llvm-bc -o %t.amdgpu.bc
+// RUN: %clang -cc1 %s -triple spirv64-unknown-unknown -emit-llvm-bc -o %t.spirv.bc
 
 // RUN: clang-offload-packager -o %t.out \
 // RUN:   --image=file=%t.elf.o,kind=openmp,triple=nvptx64-nvidia-cuda,arch=sm_70 \
@@ -49,6 +51,14 @@ __attribute__((visibility("protected"), used)) int x;
 
 // AMDGPU-LTO-TEMPS: clang{{.*}} --target=amdgcn-amd-amdhsa -mcpu=gfx1030 -flto {{.*}}-save-temps
 
+// RUN: clang-offload-packager -o %t.out \
+// RUN:   --image=file=%t.spirv.bc,kind=sycl,triple=spirv64-unknown-unknown,arch=generic
+// RUN: %clang -cc1 %s -triple x86_64-unknown-linux-gnu -emit-obj -o %t.o -fembed-offload-object=%t.out
+// RUN: not clang-linker-wrapper --host-triple=x86_64-unknown-linux-gnu --dry-run \
+// RUN:   --linker-path=/usr/bin/ld %t.o -o a.out 2>&1 | FileCheck %s --check-prefix=SPIRV-LINK
+
+// SPIRV-LINK: clang{{.*}} -o {{.*}}.img --target=spirv64-unknown-unknown {{.*}}.o --sycl-link -Xlinker -triple=spirv64-unknown-unknown -Xlinker -arch=
+
 // RUN: clang-offload-packager -o %t.out \
 // RUN:   --image=file=%t.elf.o,kind=openmp,triple=x86_64-unknown-linux-gnu \
 // RUN:   --image=file=%t.elf.o,kind=openmp,triple=x86_64-unknown-linux-gnu

diff  --git a/clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp b/clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
index 082355e6c716f..4e716e818a232 100644
--- a/clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
+++ b/clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
@@ -464,7 +464,8 @@ fatbinary(ArrayRef<std::pair<StringRef, StringRef>> InputFiles,
 } // namespace amdgcn
 
 namespace generic {
-Expected<StringRef> clang(ArrayRef<StringRef> InputFiles, const ArgList &Args) {
+Expected<StringRef> clang(ArrayRef<StringRef> InputFiles, const ArgList &Args,
+                          uint16_t ActiveOffloadKindMask) {
   llvm::TimeTraceScope TimeScope("Clang");
   // Use `clang` to invoke the appropriate device tools.
   Expected<std::string> ClangPath =
@@ -554,6 +555,17 @@ Expected<StringRef> clang(ArrayRef<StringRef> InputFiles, const ArgList &Args) {
   if (Args.hasArg(OPT_embed_bitcode))
     CmdArgs.push_back("-Wl,--lto-emit-llvm");
 
+  // For linking device code with the SYCL offload kind, special handling is
+  // required. Passing --sycl-link to clang results in a call to
+  // clang-sycl-linker. Additional linker flags required by clang-sycl-linker
+  // will be communicated via the -Xlinker option.
+  if (ActiveOffloadKindMask & OFK_SYCL) {
+    CmdArgs.push_back("--sycl-link");
+    CmdArgs.append(
+        {"-Xlinker", Args.MakeArgString("-triple=" + Triple.getTriple())});
+    CmdArgs.append({"-Xlinker", Args.MakeArgString("-arch=" + Arch)});
+  }
+
   for (StringRef Arg : Args.getAllArgValues(OPT_linker_arg_EQ))
     CmdArgs.append({"-Xlinker", Args.MakeArgString(Arg)});
   for (StringRef Arg : Args.getAllArgValues(OPT_compiler_arg_EQ))
@@ -567,7 +579,8 @@ Expected<StringRef> clang(ArrayRef<StringRef> InputFiles, const ArgList &Args) {
 } // namespace generic
 
 Expected<StringRef> linkDevice(ArrayRef<StringRef> InputFiles,
-                               const ArgList &Args) {
+                               const ArgList &Args,
+                               uint16_t ActiveOffloadKindMask) {
   const llvm::Triple Triple(Args.getLastArgValue(OPT_triple_EQ));
   switch (Triple.getArch()) {
   case Triple::nvptx:
@@ -582,7 +595,7 @@ Expected<StringRef> linkDevice(ArrayRef<StringRef> InputFiles,
   case Triple::spirv64:
   case Triple::systemz:
   case Triple::loongarch64:
-    return generic::clang(InputFiles, Args);
+    return generic::clang(InputFiles, Args, ActiveOffloadKindMask);
   default:
     return createStringError(Triple.getArchName() +
                              " linking is not supported");
@@ -792,6 +805,7 @@ bundleLinkedOutput(ArrayRef<OffloadingImage> Images, const ArgList &Args,
   llvm::TimeTraceScope TimeScope("Bundle linked output");
   switch (Kind) {
   case OFK_OpenMP:
+  case OFK_SYCL:
     return bundleOpenMP(Images);
   case OFK_Cuda:
     return bundleCuda(Images, Args);
@@ -927,6 +941,14 @@ Expected<SmallVector<StringRef>> linkAndWrapDeviceFiles(
     for (const auto &File : Input)
       ActiveOffloadKindMask |= File.getBinary()->getOffloadKind();
 
+    // Linking images of SYCL offload kind with images of other kind is not
+    // supported.
+    // TODO: Remove the above limitation.
+    if ((ActiveOffloadKindMask & OFK_SYCL) &&
+        ((ActiveOffloadKindMask ^ OFK_SYCL) != 0))
+      return createStringError("Linking images of SYCL offload kind with "
+                               "images of any other kind is not supported");
+
     // Write any remaining device inputs to an output file.
     SmallVector<StringRef> InputFiles;
     for (const OffloadFile &File : Input) {
@@ -937,7 +959,8 @@ Expected<SmallVector<StringRef>> linkAndWrapDeviceFiles(
     }
 
     // Link the remaining device files using the device linker.
-    auto OutputOrErr = linkDevice(InputFiles, LinkerArgs);
+    auto OutputOrErr =
+        linkDevice(InputFiles, LinkerArgs, ActiveOffloadKindMask);
     if (!OutputOrErr)
       return OutputOrErr.takeError();
 

diff  --git a/clang/tools/clang-sycl-linker/ClangSYCLLinker.cpp b/clang/tools/clang-sycl-linker/ClangSYCLLinker.cpp
index c640deddc9e74..79747b0550417 100644
--- a/clang/tools/clang-sycl-linker/ClangSYCLLinker.cpp
+++ b/clang/tools/clang-sycl-linker/ClangSYCLLinker.cpp
@@ -70,6 +70,8 @@ static StringRef OutputFile;
 /// Directory to dump SPIR-V IR if requested by user.
 static SmallString<128> SPIRVDumpDir;
 
+using OffloadingImage = OffloadBinary::OffloadingImage;
+
 static void printVersion(raw_ostream &OS) {
   OS << clang::getClangToolFullVersion("clang-sycl-linker") << '\n';
 }
@@ -278,8 +280,8 @@ Expected<StringRef> linkDeviceCode(ArrayRef<std::string> InputFiles,
 /// Converts 'File' from LLVM bitcode to SPIR-V format using SPIR-V backend.
 /// 'Args' encompasses all arguments required for linking device code and will
 /// be parsed to generate options required to be passed into the backend.
-static Expected<StringRef> runSPIRVCodeGen(StringRef File, const ArgList &Args,
-                                           LLVMContext &C) {
+static Error runSPIRVCodeGen(StringRef File, const ArgList &Args,
+                             StringRef OutputFile, LLVMContext &C) {
   llvm::TimeTraceScope TimeScope("SPIR-V code generation");
 
   // Parse input module.
@@ -289,7 +291,7 @@ static Expected<StringRef> runSPIRVCodeGen(StringRef File, const ArgList &Args,
     return createStringError(Err.getMessage());
 
   if (Error Err = M->materializeAll())
-    return std::move(Err);
+    return Err;
 
   Triple TargetTriple(Args.getLastArgValue(OPT_triple_EQ));
   M->setTargetTriple(TargetTriple);
@@ -333,7 +335,7 @@ static Expected<StringRef> runSPIRVCodeGen(StringRef File, const ArgList &Args,
     errs() << formatv("SPIR-V Backend: input: {0}, output: {1}\n", File,
                       OutputFile);
 
-  return OutputFile;
+  return Error::success();
 }
 
 /// Performs the following steps:
@@ -349,10 +351,54 @@ Error runSYCLLink(ArrayRef<std::string> Files, const ArgList &Args) {
   if (!LinkedFile)
     reportError(LinkedFile.takeError());
 
+  // TODO: SYCL post link functionality involves device code splitting and will
+  // result in multiple bitcode codes.
+  // The following lines are placeholders to represent multiple files and will
+  // be refactored once SYCL post link support is available.
+  SmallVector<std::string> SplitModules;
+  SplitModules.emplace_back(*LinkedFile);
+
   // SPIR-V code generation step.
-  auto SPVFile = runSPIRVCodeGen(*LinkedFile, Args, C);
-  if (!SPVFile)
-    return SPVFile.takeError();
+  for (size_t I = 0, E = SplitModules.size(); I != E; ++I) {
+    auto Stem = OutputFile.rsplit('.').first;
+    std::string SPVFile(Stem);
+    SPVFile.append("_" + utostr(I) + ".spv");
+    auto Err = runSPIRVCodeGen(SplitModules[I], Args, SPVFile, C);
+    if (Err)
+      return std::move(Err);
+    SplitModules[I] = SPVFile;
+  }
+
+  // Write the final output into file.
+  int FD = -1;
+  if (std::error_code EC = sys::fs::openFileForWrite(OutputFile, FD))
+    return errorCodeToError(EC);
+  llvm::raw_fd_ostream FS(FD, /*shouldClose=*/true);
+
+  for (size_t I = 0, E = SplitModules.size(); I != E; ++I) {
+    auto File = SplitModules[I];
+    llvm::ErrorOr<std::unique_ptr<llvm::MemoryBuffer>> FileOrErr =
+        llvm::MemoryBuffer::getFileOrSTDIN(File);
+    if (std::error_code EC = FileOrErr.getError()) {
+      if (DryRun)
+        FileOrErr = MemoryBuffer::getMemBuffer("");
+      else
+        return createFileError(File, EC);
+    }
+    OffloadingImage TheImage{};
+    TheImage.TheImageKind = IMG_Object;
+    TheImage.TheOffloadKind = OFK_SYCL;
+    TheImage.StringData["triple"] =
+        Args.MakeArgString(Args.getLastArgValue(OPT_triple_EQ));
+    TheImage.StringData["arch"] =
+        Args.MakeArgString(Args.getLastArgValue(OPT_arch_EQ));
+    TheImage.Image = std::move(*FileOrErr);
+
+    llvm::SmallString<0> Buffer = OffloadBinary::write(TheImage);
+    if (Buffer.size() % OffloadBinary::getAlignment() != 0)
+      return createStringError("Offload binary has invalid size alignment");
+    FS << Buffer;
+  }
   return Error::success();
 }
 
@@ -394,7 +440,7 @@ int main(int argc, char **argv) {
   DryRun = Args.hasArg(OPT_dry_run);
   SaveTemps = Args.hasArg(OPT_save_temps);
 
-  OutputFile = "a.spv";
+  OutputFile = "a.out";
   if (Args.hasArg(OPT_o))
     OutputFile = Args.getLastArgValue(OPT_o);
 

diff  --git a/llvm/lib/Object/OffloadBinary.cpp b/llvm/lib/Object/OffloadBinary.cpp
index 56687c9acb653..3fff6b6a09e08 100644
--- a/llvm/lib/Object/OffloadBinary.cpp
+++ b/llvm/lib/Object/OffloadBinary.cpp
@@ -301,6 +301,7 @@ OffloadKind object::getOffloadKind(StringRef Name) {
       .Case("openmp", OFK_OpenMP)
       .Case("cuda", OFK_Cuda)
       .Case("hip", OFK_HIP)
+      .Case("sycl", OFK_SYCL)
       .Default(OFK_None);
 }
 
@@ -312,6 +313,8 @@ StringRef object::getOffloadKindName(OffloadKind Kind) {
     return "cuda";
   case OFK_HIP:
     return "hip";
+  case OFK_SYCL:
+    return "sycl";
   default:
     return "none";
   }


        


More information about the llvm-commits mailing list