[clang] [llvm] [CGData][ThinLTO] Global Outlining with Two-CodeGen Rounds (PR #90933)

Mon Sep 30 09:36:45 PDT 2024

rlavaee wrote:

> > * Looking at the NFC, this seems like it has very similar issues to Propeller, which wants to redo just the codegen with a new injected profile and BB ordering. It would be good to see if we can converge to similar approaches. I asked @rlavaee to take a look and he is reading through the background on this work. @rlavaee do you think Propeller could use a similar approach to this where it saves the pre-codegen bitcode and re-loads it instead of redoing opt? This isn't necessarily an action item for this PR, but I wanted Rahman to take a look since he is more familiar with codegen.
> 
> It's interesting to know that Propeller wants to redo the codegen. I'm happy to align with this work. We've already started discussing this and have shared some details from our side. Here's the link for more info: https://discourse.llvm.org/t/rfc-enhanced-machine-outliner-part-2-thinlto-nolto/78753/11?u=kyulee-com.

Yes. Propeller's final post-link optimization could reuse the optimized bitcode cached during the profiled binary build instead of redoing opt, which would be an improvement for Propeller. @amharc did some experiments to measure the gain from this. IIUC, we would have to use `-codegen-data-generate` in the profiled build and `-codegen-data-use` in the post-link build, whereas in this PR both happen within the same build.

https://github.com/llvm/llvm-project/pull/90933