[clang] Pass -offload-lto instead of -lto for cuda/hip kernels (PR #125243)

Mon Feb 3 06:31:53 PST 2025

================
@@ -498,12 +498,16 @@ Expected<StringRef> clang(ArrayRef<StringRef> InputFiles, const ArgList &Args) {
   };
 
   // Forward all of the `--offload-opt` and similar options to the device.
-  CmdArgs.push_back("-flto");
   for (auto &Arg : Args.filtered(OPT_offload_opt_eq_minus, OPT_mllvm))
     CmdArgs.append(
         {"-Xlinker",
          Args.MakeArgString("--plugin-opt=" + StringRef(Arg->getValue()))});
 
+  if (Triple.isNVPTX() || Triple.isAMDGPU())
+    CmdArgs.push_back("-foffload-lto");
+  else
+    CmdArgs.push_back("-flto");
----------------
jhuber6 wrote:

There was actually a recent issue that @ye-luo hit, where passing `-flto` by default broke x86-64 offloading. I could see passing `-flto` *only* for the NVPTX and AMDGPU toolchains, because in those cases we know they support LTO.
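
As a rough illustration of that alternative (not the code in this patch), the idea could be sketched as below. The `DeviceArch` enum and the `ltoArgsFor` helper are hypothetical stand-ins for the `Triple.isNVPTX()` / `Triple.isAMDGPU()` checks in the hunk above, so that the snippet compiles without the LLVM headers. It also assumes that other targets, such as x86-64, would simply get no LTO flag.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the llvm::Triple queries used in the real code.
enum class DeviceArch { NVPTX, AMDGPU, X86_64 };

// Sketch: add `-flto` only for the device toolchains known to support LTO.
// Any other target (for example x86-64 offloading) gets no LTO flag at all.
std::vector<std::string> ltoArgsFor(DeviceArch Arch) {
  std::vector<std::string> CmdArgs;
  if (Arch == DeviceArch::NVPTX || Arch == DeviceArch::AMDGPU)
    CmdArgs.push_back("-flto");
  return CmdArgs;
}
```

This differs from the hunk above, which keeps `-flto` as the fallback in the `else` branch and switches the GPU targets to `-foffload-lto`.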

https://github.com/llvm/llvm-project/pull/125243