[llvm-branch-commits] AlwaysInliner: A new inlining algorithm to interleave alloca promotion with inlines. (PR #145613)
Amara Emerson via llvm-branch-commits
llvm-branch-commits at lists.llvm.org
Wed Jun 25 12:30:24 PDT 2025
================
@@ -129,6 +147,245 @@ bool AlwaysInlineImpl(
return Changed;
}
+/// Promote allocas to registers if possible.
+static void promoteAllocas(
+ Function *Caller, SmallPtrSetImpl<AllocaInst *> &AllocasToPromote,
+ function_ref<AssumptionCache &(Function &)> &GetAssumptionCache) {
+ if (AllocasToPromote.empty())
+ return;
+
+ SmallVector<AllocaInst *, 4> PromotableAllocas;
+ llvm::copy_if(AllocasToPromote, std::back_inserter(PromotableAllocas),
+ isAllocaPromotable);
+ if (PromotableAllocas.empty())
+ return;
+
+ DominatorTree DT(*Caller);
+ AssumptionCache &AC = GetAssumptionCache(*Caller);
+ PromoteMemToReg(PromotableAllocas, DT, &AC);
+ NumAllocasPromoted += PromotableAllocas.size();
+ // Emit a remark for the promotion.
+ OptimizationRemarkEmitter ORE(Caller);
+ DebugLoc DLoc = Caller->getEntryBlock().getTerminator()->getDebugLoc();
+ ORE.emit([&]() {
+ return OptimizationRemark(DEBUG_TYPE, "PromoteAllocas", DLoc,
+ &Caller->getEntryBlock())
+ << "Promoting " << ore::NV("NumAlloca", PromotableAllocas.size())
+ << " allocas to SSA registers in function '"
+ << ore::NV("Function", Caller) << "'";
+ });
+ LLVM_DEBUG(dbgs() << "Promoted " << PromotableAllocas.size()
+ << " allocas to registers in function " << Caller->getName()
+ << "\n");
+}
+
+/// We use a different visitation order of functions here to solve a phase
+/// ordering problem. After inlining, a caller function may have allocas that
+/// were previously used for passing reference arguments to the callee that
+/// are now promotable to registers using SROA/mem2reg. However, if we just let
+/// the AlwaysInliner continue inlining everything at once, the later SROA pass
+/// in the pipeline will end up placing phis for these allocas into blocks along
+/// the dominance frontier which may extend further than desired (e.g. loop
+/// headers). This can happen when the caller is then inlined into another
+/// caller, and the allocas end up hoisted further before SROA is run.
+///
+/// Instead, what we want to do, as best we can, is inline leaf functions into
+/// their callers and then run PromoteMemToReg() on the allocas that were
+/// passed into the callee before it was inlined.
+///
+/// We want to do this *before* the caller is inlined into another caller
+/// because we want the alloca promotion to happen before its scope extends too
+/// far because of further inlining.
+///
+/// Here's a simple pseudo-example:
+/// outermost_caller() {
+/// for (...) {
+/// middle_caller();
+/// }
+/// }
+///
+/// middle_caller() {
+/// int stack_var;
+/// inner_callee(&stack_var);
+/// }
+///
+/// inner_callee(int *x) {
+/// // Do something with x.
+/// }
+///
+/// In this case, we want to inline inner_callee() into middle_caller() and
+/// then promote stack_var to a register before we inline middle_caller() into
+/// outermost_caller(). The regular AlwaysInliner would inline everything at
+/// once, and then SROA/mem2reg would promote stack_var to a register, but in
+/// the context of outermost_caller(), which is not what we want.
----------------
aemerson wrote:
Yes, the traversal order matters here, because for optimal codegen we want mem2reg to happen between the inner->middle and middle->outer inlines. If you do it the other way around, mem2reg can't do anything until the final inner->outer inline, and by that point it's too late.
For now I think only this promotion is a known issue, I don't know of general issues with simplification.
https://github.com/llvm/llvm-project/pull/145613