[llvm-branch-commits] AlwaysInliner: A new inlining algorithm to interleave alloca promotion with inlines. (PR #145613)
Mircea Trofin via llvm-branch-commits
llvm-branch-commits at lists.llvm.org
Wed Jun 25 12:24:52 PDT 2025
================
@@ -129,6 +147,245 @@ bool AlwaysInlineImpl(
return Changed;
}
+/// Promote allocas to registers if possible.
+static void promoteAllocas(
+    Function *Caller, SmallPtrSetImpl<AllocaInst *> &AllocasToPromote,
+    function_ref<AssumptionCache &(Function &)> &GetAssumptionCache) {
+  if (AllocasToPromote.empty())
+    return;
+
+  SmallVector<AllocaInst *, 4> PromotableAllocas;
+  llvm::copy_if(AllocasToPromote, std::back_inserter(PromotableAllocas),
+                isAllocaPromotable);
+  if (PromotableAllocas.empty())
+    return;
+
+  DominatorTree DT(*Caller);
+  AssumptionCache &AC = GetAssumptionCache(*Caller);
+  PromoteMemToReg(PromotableAllocas, DT, &AC);
+  NumAllocasPromoted += PromotableAllocas.size();
+  // Emit a remark for the promotion.
+  OptimizationRemarkEmitter ORE(Caller);
+  DebugLoc DLoc = Caller->getEntryBlock().getTerminator()->getDebugLoc();
+  ORE.emit([&]() {
+    return OptimizationRemark(DEBUG_TYPE, "PromoteAllocas", DLoc,
+                              &Caller->getEntryBlock())
+           << "Promoting " << ore::NV("NumAlloca", PromotableAllocas.size())
+           << " allocas to SSA registers in function '"
+           << ore::NV("Function", Caller) << "'";
+  });
+  LLVM_DEBUG(dbgs() << "Promoted " << PromotableAllocas.size()
+                    << " allocas to registers in function " << Caller->getName()
+                    << "\n");
+}
+
+/// We use a different visitation order of functions here to solve a phase
+/// ordering problem. After inlining, a caller function may have allocas that
+/// were previously used for passing reference arguments to the callee that
+/// are now promotable to registers, using SROA/mem2reg. However if we just let
+/// the AlwaysInliner continue inlining everything at once, the later SROA pass
+/// in the pipeline will end up placing phis for these allocas into blocks along
+/// the dominance frontier which may extend further than desired (e.g. loop
+/// headers). This can happen when the caller is then inlined into another
+/// caller, and the allocas end up hoisted further before SROA is run.
+///
+/// Instead, what we want to do, as best we can, is to inline leaf
+/// functions into callers, and then run PromoteMemToReg() on the allocas that
+/// were passed into the callee before it was inlined.
+///
+/// We want to do this *before* the caller is inlined into another caller
+/// because we want the alloca promotion to happen before its scope extends too
+/// far because of further inlining.
+///
+/// Here's a simple pseudo-example:
+/// outermost_caller() {
+///   for (...) {
+///     middle_caller();
+///   }
+/// }
+///
+/// middle_caller() {
+///   int stack_var;
+///   inner_callee(&stack_var);
+/// }
+///
+/// inner_callee(int *x) {
+///   // Do something with x.
+/// }
+///
+/// In this case, we want to inline inner_callee() into middle_caller() and
+/// then promote stack_var to a register before we inline middle_caller() into
+/// outermost_caller(). The regular always_inliner would inline everything at
+/// once, and then SROA/mem2reg would promote stack_var to a register but in
+/// the context of outermost_caller() which is not what we want.
----------------
mtrofin wrote:
There's no plan yet for the ModuleInliner; currently it lets us experiment with alternative traversals, and some of them have been showing promise.
I'm mainly trying to understand:
- whether the order of traversal matters (for this problem here)
- whether all the function simplification passes need to be run after some inlining, or just some of them. I'm guessing it's really "just a specific subset", correct?
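
To make the traversal question concrete, here is a minimal sketch of a callee-before-caller (post-order) schedule that runs a promotion step on each caller right after its callees are inlined. It is not the code in this PR and does not use LLVM's call graph APIs: `ToyCallGraph` and `interleavedSchedule` are hypothetical names, and the "inline" and "promote" steps are only recorded as strings.

```cpp
#include <functional>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy call graph: caller name -> callee names. Hypothetical stand-in, not
// LLVM's CallGraph.
using ToyCallGraph = std::map<std::string, std::vector<std::string>>;

// Visit callees before callers (post-order DFS). For each caller, record one
// "inline" step per callee, followed by a single "promote" step, so promotion
// happens before that caller is itself inlined further up.
std::vector<std::string> interleavedSchedule(const ToyCallGraph &CG,
                                             const std::string &Root) {
  std::vector<std::string> Steps;
  std::set<std::string> Visited;
  std::function<void(const std::string &)> Visit = [&](const std::string &F) {
    if (!Visited.insert(F).second)
      return; // Already handled; this also stops recursion on cycles.
    auto It = CG.find(F);
    if (It == CG.end() || It->second.empty())
      return; // Leaf function: nothing to inline into it.
    for (const std::string &Callee : It->second)
      Visit(Callee);
    for (const std::string &Callee : It->second)
      Steps.push_back("inline " + Callee + " into " + F);
    Steps.push_back("promote allocas in " + F);
  };
  Visit(Root);
  return Steps;
}
```

For the three-function example in the patch comment, this schedule inlines inner_callee() into middle_caller() and promotes there first, and only then inlines middle_caller() into outermost_caller(). A different traversal would only change where the promotion steps land relative to the inlines.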
https://github.com/llvm/llvm-project/pull/145613