[clang] [HIP][AMDGPU] Use non-LTO pipeline for non-RDC in the linker wrapper (PR #201135)

Yaxun Liu via cfe-commits cfe-commits at lists.llvm.org
Tue Jun 2 10:53:02 PDT 2026


================
@@ -548,6 +551,12 @@ Expected<StringRef> clang(ArrayRef<StringRef> InputFiles, const ArgList &Args,
   if (!Triple.isNVPTX() && !Triple.isSPIRV())
     CmdArgs.push_back("-Wl,--no-undefined");
 
+  // The device inputs are bitcode stored in files with an object extension.
+  // Force the IR input language so Clang runs the compile and backend phases
+  // instead of treating them as linker inputs, which would defer codegen to
+  // the LTO link and defeat the non-LTO pipeline.
+  if (NonLTOAMDGPU)
+    CmdArgs.append({"-x", "ir"});
----------------
yxsamliu wrote:

Good point on PGO. The profile runtime isn't `-mlink`'d, so I now keep LTO when `-fprofile-generate` is set. Only plain non-RDC takes the non-LTO path, so profile generation still links and optimizes the runtime as before. This does highlight the real gap you mentioned: the non-RDC non-LTO path can't link device-side compiler-rt libraries properly, which is part of why the unified RDC/non-RDC interface in the FIXME would help.
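For readers following along, the gating described above can be sketched roughly as follows. This is not the code from the patch: the `DeviceLinkOptions` struct, its fields, and the helper names `useNonLTOPipeline` and `extraClangArgs` are hypothetical stand-ins for whatever the linker wrapper actually derives from its `ArgList`. Only the `-x ir` arguments and the three conditions (AMDGPU, non-RDC, no `-fprofile-generate`) come from the discussion.

```cpp
#include <string>
#include <vector>

// Hypothetical summary of the options relevant to the decision. The real
// wrapper reads these from its parsed arguments and the target triple.
struct DeviceLinkOptions {
  bool IsAMDGPU = false;        // target triple is AMDGPU
  bool IsRDC = false;           // relocatable device code requested
  bool ProfileGenerate = false; // -fprofile-generate is set
};

// Assumed predicate: only the plain non-RDC AMDGPU case skips LTO.
// Profile generation keeps LTO so the profile runtime is still linked
// and optimized as before.
bool useNonLTOPipeline(const DeviceLinkOptions &Opts) {
  return Opts.IsAMDGPU && !Opts.IsRDC && !Opts.ProfileGenerate;
}

// Extra arguments for the Clang invocation. On the non-LTO path the
// bitcode inputs are forced to be treated as IR, so Clang runs the
// compile and backend phases instead of deferring codegen to the link.
std::vector<std::string> extraClangArgs(const DeviceLinkOptions &Opts) {
  std::vector<std::string> Args;
  if (useNonLTOPipeline(Opts)) {
    Args.push_back("-x");
    Args.push_back("ir");
  }
  return Args;
}
```

Under these assumptions, a plain non-RDC AMDGPU link yields `-x ir`, while enabling either RDC or profile generation yields no extra arguments and leaves the LTO path in place.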


https://github.com/llvm/llvm-project/pull/201135

