[clang] [HIP][AMDGPU] Use non-LTO pipeline for non-RDC in the linker wrapper (PR #201135)
Yaxun Liu via cfe-commits
cfe-commits at lists.llvm.org
Tue Jun 2 10:53:02 PDT 2026
================
@@ -548,6 +551,12 @@ Expected<StringRef> clang(ArrayRef<StringRef> InputFiles, const ArgList &Args,
   if (!Triple.isNVPTX() && !Triple.isSPIRV())
     CmdArgs.push_back("-Wl,--no-undefined");
+  // The device inputs are bitcode stored in files with an object extension.
+  // Force the IR input language so Clang runs the compile and backend phases
+  // instead of treating them as linker inputs, which would defer codegen to
+  // the LTO link and defeat the non-LTO pipeline.
+  if (NonLTOAMDGPU)
+    CmdArgs.append({"-x", "ir"});
----------------
yxsamliu wrote:
Good point on PGO. The profile runtime isn't `-mlink`'d, so I now keep LTO when `-fprofile-generate` is set. Only plain non-RDC takes the non-LTO path, so profile generation still links and optimizes the runtime as before. This does highlight the real gap you mentioned: non-RDC non-LTO can't link device-side compiler-rt libraries properly, which is part of why the unified RDC/non-RDC interface in the FIXME would help.
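For illustration only, the selection rule described above could be sketched roughly as follows. This is a standalone approximation, not the code in the PR: `DeviceLinkOptions`, `useNonLTOPipeline`, and `extraClangArgs` are hypothetical names, and the real wrapper derives these conditions from its `ArgList` and target triple rather than from a plain struct.

```cpp
#include <string>
#include <vector>

// Hypothetical summary of the inputs to the decision; the names are
// illustrative and do not appear in the linker wrapper.
struct DeviceLinkOptions {
  bool IsAMDGPU = false;
  bool IsRDC = false;           // relocatable device code requested
  bool ProfileGenerate = false; // -fprofile-generate is set
};

// Only plain non-RDC AMDGPU links take the non-LTO path. Profile generation
// keeps LTO so the profile runtime is still linked and optimized.
bool useNonLTOPipeline(const DeviceLinkOptions &Opts) {
  return Opts.IsAMDGPU && !Opts.IsRDC && !Opts.ProfileGenerate;
}

// Mirrors the hunk above: force the IR input language only on the non-LTO
// path, so bitcode stored in object-named files is compiled, not linked.
std::vector<std::string> extraClangArgs(const DeviceLinkOptions &Opts) {
  std::vector<std::string> Args;
  if (useNonLTOPipeline(Opts)) {
    Args.push_back("-x");
    Args.push_back("ir");
  }
  return Args;
}
```

Under these assumptions, an AMDGPU non-RDC link without `-fprofile-generate` gets `-x ir` appended, while RDC or profile-generating links get no extra arguments and stay on the LTO path.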
https://github.com/llvm/llvm-project/pull/201135