[clang] [llvm] [CGData][ThinLTO] Global Outlining with Two-CodeGen Rounds (PR #90933)

Wed Sep 18 07:10:43 PDT 2024

================
@@ -1558,6 +1562,60 @@ class InProcessThinBackend : public ThinBackendProc {
     return BackendThreadPool.getMaxConcurrency();
   }
 };
+
+/// This Backend will run ThinBackend process but throw away all the output from
+/// the codegen. This class facilitates the first codegen round.
+class NoOutputThinBackend : public InProcessThinBackend {
----------------
kyulee-com wrote:

> Lastly just an idea: We could just hold the optimized bitcode in a similar buffer in memory, rather than writing them to disk and reading them again between rounds. Might run a bit faster.

That's a great point! I've been cautious about peak memory usage, especially with large app binaries. Buffering the object files isn't a new concern, since the linker already buffers the resulting object files. It's also worth noting that the buffer from the first round is discarded right before the second round, so we effectively hold only one buffer, for the final object files. The optimized bitcode files, on the other hand, are usually much larger than object files, so I chose to write them to disk instead of keeping everything in memory. For smaller app binaries, buffering the optimized bitcode could be beneficial, as you suggested. I think we can always revisit this and potentially add it as an option if needed.
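
To make the trade-off concrete, here is a minimal, standalone sketch of the disk round-trip between the two rounds. It is not the code in this PR: `Bytes`, `spillToDisk`, `reloadFromDisk`, and the `taskN.opt.bc` file naming are hypothetical names used only for illustration, and it uses the C++17 standard library rather than LLVM's buffer and file APIs.

```cpp
#include <cstddef>
#include <filesystem>
#include <fstream>
#include <iterator>
#include <string>

namespace fs = std::filesystem;

// Hypothetical stand-in for one module's optimized bitcode.
using Bytes = std::string;

// After the first round: write the optimized bitcode for one task to a
// scratch directory so it does not have to stay resident in memory.
// The file name scheme is an assumption made for this sketch.
fs::path spillToDisk(const fs::path &Dir, std::size_t Task, const Bytes &BC) {
  fs::path P = Dir / ("task" + std::to_string(Task) + ".opt.bc");
  std::ofstream OS(P, std::ios::binary | std::ios::trunc);
  OS.write(BC.data(), static_cast<std::streamsize>(BC.size()));
  return P;
}

// Before the second round: read one module back only when its codegen is
// about to run, so at most the modules in flight are held in memory.
Bytes reloadFromDisk(const fs::path &P) {
  std::ifstream IS(P, std::ios::binary);
  return Bytes(std::istreambuf_iterator<char>(IS),
               std::istreambuf_iterator<char>());
}
```

With this shape, peak memory is bounded by the modules being processed at a time plus the final object buffers, at the cost of one extra write and read per module. Keeping the optimized bitcode in memory instead would amount to replacing the two helpers with a container of `Bytes`, which is the option suggested above for smaller binaries.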

https://github.com/llvm/llvm-project/pull/90933