[llvm-dev] [RFC] Replacing inalloca with llvm.call.setup and preallocated

Thu Apr 16 13:05:35 PDT 2020

On Sat, Mar 28, 2020 at 2:20 PM Eli Friedman <efriedma at quicinc.com> wrote:

> This would specifically be for cases where we try to rewrite the
> signature?  I would assume we should forbid rewriting the signature of a
> call with an operand bundle.  And once some optimization drops the bundle
> and preallocated marking, to allow such rewriting, the signature doesn’t
> need to match anymore.
>

Yes, I really would like to enable DAE and other signature rewriting IPO
transforms. Maybe today DAE doesn't run on calls with bundles, but this
feature is designed to allow the non-preallocated arguments to be removed
or expanded into multiple arguments without disturbing the preallocated
argument numbering.

> Ultimately, I think it’s worth some effort here to try to avoid blocking
> optimizations like jump threading.  That said, if you want to make the
> calls “noduplicate”, it isn’t that terrible of an alternative.
>

Does "noduplicate" block inlining, though? Maybe we really want
"convergent"? In any case, I'm OK with powering down jump threading to pick
up IPO, which is incompatible with today's inalloca.

> Good.  It might be a good idea to try to write out an algorithm for this
> to ensure this works out.  In particular, I’m concerned about cases where
> two predecessors of a basic block appear to have a different stack size (an
> if-then-else, or a loop backedge).  We need to make sure such cases are
> either invalid, or UB on entry to the block.
>
>
>
> I spent a little time thinking, and I’m not sure what rules we need to
> make this work out.  For example, should we forbid tail-merging multiple
> calls to abort()?  IF we should, how would we write a rule which restricts
> that?
>

This is actually a big open problem, and it came up again in the SEH
discussion. It seems to be my fate to struggle against the LLVM IR design
decision to not have scopes.

Without introducing new IR constructs, we could define a list of
instructions that set the current region. We could write an algorithm for
assigning regions to each block. The region is implicit in the IR. These
are the things that could create regions:
- call.preallocated.setup
- catchpad
- cleanuppad
- lifetime.start? unclear
Passes are required to ensure that each BB belongs to exactly one region.
Each region belongs to its parent, and ending a region returns to the
parent of the ending region.

I don't think this idea is ready to be added to LangRef, but it is a good
future direction, perhaps with new supporting IR constructs.

I think for now we have to live with the possibility that the analysis
which assigns SP adjustment levels to MBBs may fail to find a unique SP
level, in which case we must use a frame pointer.

OTOH, we can easily establish the invariant at the MIR level. We should
always be able to assign each MBB a unique most recently active call site
and an SP adjustment level. We can easily teach BranchFolding to preserve
this invariant. We already do it for funclets.

I’m not really concerned with funny usage of calls to alloca() in call
> arguments, or anything like that. I’m happy to pick whatever rule is
> easiest for us.  I’m more concerned with ensuring nothing blows up if we
> inline a call to a function that contains a VLA, or something like that.
>

Sounds good. Inlining dynamic allocas and VLAs should already just work.
The inliner places stacksave/stackrestore calls around the original call
site, if dynamic allocas were present in the inlined code.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200416/21452d2a/attachment.html>