[llvm] [LTO] Fix a crash with thin LTO caching and asm output (PR #138203)

Wed May 14 04:17:09 PDT 2025

================
@@ -439,27 +439,33 @@ static void codegen(const Config &Conf, TargetMachine *TM,
   std::unique_ptr<CachedFileStream> &Stream = *StreamOrErr;
   TM->Options.ObjectFilenameForDebug = Stream->ObjectPathName;
 
-  legacy::PassManager CodeGenPasses;
-  TargetLibraryInfoImpl TLII(Mod.getTargetTriple());
-  CodeGenPasses.add(new TargetLibraryInfoWrapperPass(TLII));
-  // No need to make index available if the module is empty.
-  // In theory these passes should not use the index for an empty
-  // module, however, this guards against doing any unnecessary summary-based
-  // analysis in the case of a ThinLTO build where this might be an empty
-  // regular LTO combined module, with a large combined index from ThinLTO.
-  if (!isEmptyModule(Mod))
-    CodeGenPasses.add(
-        createImmutableModuleSummaryIndexWrapperPass(&CombinedIndex));
-  if (Conf.PreCodeGenPassesHook)
-    Conf.PreCodeGenPassesHook(CodeGenPasses);
-  if (TM->addPassesToEmitFile(CodeGenPasses, *Stream->OS,
-                              DwoOut ? &DwoOut->os() : nullptr,
-                              Conf.CGFileType))
-    report_fatal_error("Failed to setup codegen");
-  CodeGenPasses.run(Mod);
-
-  if (DwoOut)
-    DwoOut->keep();
+  // Create the LTO pipeline in its own scope so it gets deleted before
+  // Stream->commit() is called. The commit function of CacheFile deletes
----------------
anjenner wrote:

I would like to strongly advocate for leaving PR136121 in place and for merging this PR. My reasoning is as follows:
1. There is a general principle in C++ that destructors should never do anything except release resources - if they perform operations that can fail then those errors cannot be handled except out-of-band (tearing down the process). PR136121 improves the localCache API in this respect. While doing so has uncovered other instances of destructors doing non-release operations, that in itself is not a reason to abandon this engineering principle.
2. While most of the callers of the new commit() function currently still call report_fatal_error on the result (as I wanted to keep PR136121 as unimpactful as possible), they may wish to improve their error-handling in the future and PR136121 is a necessary component of that. The motivating case is an out-of-tree use https://github.com/ROCm/llvm-project/blob/amd-staging/amd/comgr/src/comgr-cache.cpp for which tearing down the process would be highly undesirable - if that was an unavoidable aspect of the localCache API then ROCm would need to have its own fork of the localCache API (that didn't do the commit in the destructor) with the maintenance headaches that would incur. It seems likely to me that other users of localCache will in the future realise that they have similar requirements.
3. Reverting PR136121 now would also mean that the other pieces of code that have had to be changed in response to PR136121 would have to be changed back - it would be more churn.
4. I stand by what I said about there being latent fragility in the codegen() function in llvm/lib/LTO/LTOBackend.cpp . The MCAsmStreamer object is destroyed (and hence the formatted_raw_ostream object) when CodeGenPasses is destroyed. If the CodeGenPasses constructor were in the future to be moved above the Stream constructor then it would be destroyed after the CachedFileStream (the order in which these destructors are run is the reverse of the order in which the corresponding constructors are run - the position of the addPassesToEmitFile() call is not relevant as the end of the lifetime of the MCAsmStreamer object is tied to the end of the lifetime of the legacy::PassManager object and hence the time when that object was constructed). So reverting PR136121 wouldn't actually solve the latent fragility that is the purported reason for reverting it.

https://github.com/llvm/llvm-project/pull/138203