<div dir="ltr">Bringing this back up for discussion on handling exceptions.<div><br></div><div>According to the <a href="https://llvm.org/docs/InAlloca.html" target="_blank">inalloca design</a>, there should be a stackrestore after an invoke in both the non-exceptional and exceptional cases (that doesn't seem to be happening in some cases as we've seen, but that's beside the point).</div><div><br></div><div>Does it make sense to model a preallocated call as handling the cleanup of the stack in normal control flow (e.g. always for a normal call, and in the non-exceptional path for an invoke)? Then @llvm.call.preallocated.teardown is only necessary in the exceptional path to clean up the stack.</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Apr 16, 2020 at 1:40 PM Eli Friedman <<a href="mailto:efriedma@quicinc.com" target="_blank">efriedma@quicinc.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div lang="EN-US">
<div>
<p class="MsoNormal"><u></u> <u></u></p>
<p class="MsoNormal" style="margin-left:0.5in"><b>From:</b> Reid Kleckner <<a href="mailto:rnk@google.com" target="_blank">rnk@google.com</a>>
<br>
<b>Sent:</b> Thursday, April 16, 2020 1:06 PM<br>
<b>To:</b> Eli Friedman <<a href="mailto:efriedma@quicinc.com" target="_blank">efriedma@quicinc.com</a>><br>
<b>Cc:</b> Arthur Eubanks <<a href="mailto:aeubanks@google.com" target="_blank">aeubanks@google.com</a>>; llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>><br>
<b>Subject:</b> [EXT] Re: [llvm-dev] [RFC] Replacing inalloca with llvm.call.setup and preallocated<u></u><u></u></p>
<p class="MsoNormal" style="margin-left:0.5in"><u></u> <u></u></p>
<div>
<div>
<p class="MsoNormal" style="margin-left:0.5in">On Sat, Mar 28, 2020 at 2:20 PM Eli Friedman <<a href="mailto:efriedma@quicinc.com" target="_blank">efriedma@quicinc.com</a>> wrote:<u></u><u></u></p>
</div>
<div>
<blockquote style="border-top:none;border-right:none;border-bottom:none;border-left:1pt solid rgb(204,204,204);padding:0in 0in 0in 6pt;margin:5pt 0in 5pt 4.8pt">
<div>
<div>
<div>
<div>
<div>
<p class="MsoNormal" style="margin-left:0.5in">
This would specifically be for cases where we try to rewrite the signature? I would assume we should forbid rewriting the signature of a call with an operand bundle. And once some optimization drops the bundle and preallocated marking, to allow such rewriting,
the signature doesn’t need to match anymore.<u></u><u></u></p>
</div>
</div>
</div>
</div>
</div>
</blockquote>
<div>
<p class="MsoNormal" style="margin-left:0.5in"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:0.5in">Yes, I really would like to enable DAE and other signature rewriting IPO transforms. Maybe today DAE doesn't run on calls with bundles, but this feature is designed to allow the non-preallocated arguments to be
removed or expanded into multiple arguments without disturbing the preallocated argument numbering.<u></u><u></u></p>
<p class="MsoNormal"><u></u> <u></u></p>
<p class="MsoNormal">This is sort of beside the point. I think we get the best of everything if we just say the bundle and preallocated attribute have to be dropped before any other IPO transforms. (If we control the function and all the callers, this transform
should be relatively simple: we just remove the attribute and bundle from the function and all the callers, and insert an llvm.call.teardown after each call where we dropped a bundle.)<u></u><u></u></p>
<p class="MsoNormal"><u></u> <u></u></p>
<p class="MsoNormal">Even if we can potentially rewrite signatures in some cases without dropping the preallocated attribute, there isn’t really any incentive to do that, as far as I know.<u></u><u></u></p>
</div>
<blockquote style="border-top:none;border-right:none;border-bottom:none;border-left:1pt solid rgb(204,204,204);padding:0in 0in 0in 6pt;margin:5pt 0in 5pt 4.8pt">
<div>
<div>
<div>
<p class="MsoNormal" style="margin-left:0.5in">
Good. It might be a good idea to try to write out an algorithm for this to ensure this works out. In particular, I’m concerned about cases where two predecessors of a basic block appear to have a different stack size (an if-then-else, or a loop backedge).
We need to make sure such cases are either invalid, or UB on entry to the block.<u></u><u></u></p>
<p class="MsoNormal" style="margin-left:0.5in">
<u></u><u></u></p>
<p class="MsoNormal" style="margin-left:0.5in">
I spent a little time thinking, and I’m not sure what rules we need to make this work out. For example, should we forbid tail-merging multiple calls to abort()? If we should, how would we write a rule which restricts that?<u></u><u></u></p>
</div>
</div>
</div>
</blockquote>
<div>
<p class="MsoNormal" style="margin-left:0.5in"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:0.5in">This is actually a big open problem, and it came up again in the SEH discussion. It seems to be my fate to struggle against the LLVM IR design decision to not have scopes.<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:0.5in"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:0.5in">Without introducing new IR constructs, we could define a list of instructions that set the current region. We could write an algorithm for assigning regions to each block. The region is implicit in the IR. These
are the things that could create regions:<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:0.5in">- call.preallocated.setup<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:0.5in">- catchpad<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:0.5in">- cleanuppad<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:0.5in">- lifetime.start? unclear<u></u><u></u></p>
<p class="MsoNormal"><u></u> <u></u></p>
<p class="MsoNormal">stacksave/stackrestore.<u></u><u></u></p>
<p class="MsoNormal"><u></u> <u></u></p>
<p class="MsoNormal">I don’t think there’s any reason to make lifetime intrinsics properly nest; nothing really cares, as far as I know. (The current lifetime intrinsics have problems, but I don’t think that’s one of them.)<u></u><u></u></p>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:0.5in">Passes are required to ensure that each BB belongs to exactly one region. Each region belongs to its parent, and ending a region returns to the parent of the ending region.<u></u><u></u></p>
</div>
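<div>
<p class="MsoNormal">A minimal sketch of the implicit region assignment described above, written in Python rather than against any LLVM API. The function name assign_regions, the ("open", name) / ("close", name) event encoding, and the dict-based CFG are all hypothetical and chosen only for illustration. Each block is given the stack of regions open at its entry, closing a region returns to its parent, and a block whose predecessors disagree (the if-then-else or loop backedge case) is reported as a conflict.</p>
<pre>
```python
from collections import deque


def assign_regions(entry, successors, events):
    """Assign each reachable block the tuple of regions open at its entry.

    successors maps a block name to the list of its successor blocks.
    events maps a block name to an ordered list of ("open", name) or
    ("close", name) pairs; both encodings are assumptions of this sketch.
    Returns (regions_at_entry, conflicts).
    """
    regions_at_entry = {entry: ()}
    conflicts = []
    worklist = deque([entry])
    while worklist:
        block = worklist.popleft()
        stack = list(regions_at_entry[block])
        for kind, name in events.get(block, []):
            if kind == "open":
                stack.append(name)
            elif stack and stack[-1] == name:
                stack.pop()  # ending a region returns to its parent
            else:
                conflicts.append((block, "close does not match innermost region"))
        exit_state = tuple(stack)
        for succ in successors.get(block, []):
            if succ not in regions_at_entry:
                # First time we reach succ: it inherits this exit state.
                regions_at_entry[succ] = exit_state
                worklist.append(succ)
            elif regions_at_entry[succ] != exit_state:
                conflicts.append((succ, "predecessors disagree on region"))
    return regions_at_entry, conflicts


# An if-then-else where only the "then" arm ends the region.
cfg = {"entry": ["then", "else"], "then": ["join"], "else": ["join"], "join": []}
bad = {"entry": [("open", "setup1")], "then": [("close", "setup1")]}
good = {"entry": [("open", "setup1")],
        "then": [("close", "setup1")], "else": [("close", "setup1")]}
bad_regions, bad_conflicts = assign_regions("entry", cfg, bad)
good_regions, good_conflicts = assign_regions("entry", cfg, good)
```
</pre>
<p class="MsoNormal">In the first input only one arm ends the region, so the join block is flagged. In the second both arms end it and every block has exactly one region. Whether a flagged block should be invalid IR or UB on entry is the open question raised earlier in the thread, and the sketch does not decide it.</p>
</div>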
<div>
<p class="MsoNormal" style="margin-left:0.5in"><br>
I don't think this idea is ready to be added to LangRef, but it is a good future direction, perhaps with new supporting IR constructs.<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:0.5in"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:0.5in">I think for now we have to live with the possibility that the analysis which assigns SP adjustment levels to MBBs may fail to find a unique SP level, in which case we must use a frame pointer.<u></u><u></u></p>
<p class="MsoNormal"><u></u> <u></u></p>
<p class="MsoNormal">Okay. Can we at least list out the cases where we expect this might happen?<u></u><u></u></p>
</div>
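<div>
<p class="MsoNormal">One case where the SP-level analysis fails is the tail-merged abort() block mentioned earlier in the thread. The Python sketch below is illustrative only and is not the actual MIR analysis: sp_levels, the block names, and the per-block sp_delta table (net bytes of argument space a block leaves allocated) are hypothetical. It propagates an SP adjustment from the entry block and reports whether any block is reached with two different adjustments, which is the situation that would force a frame pointer.</p>
<pre>
```python
def sp_levels(entry, successors, sp_delta):
    """Return (levels, needs_frame_pointer).

    successors maps a block name to the list of its successor blocks.
    sp_delta maps a block name to the net SP adjustment, in bytes, that the
    block leaves in place at its exit (an assumption of this sketch).
    levels maps each reachable block to the SP adjustment at its entry.
    """
    levels = {entry: 0}
    needs_frame_pointer = False
    pending = [entry]
    while pending:
        block = pending.pop()
        exit_level = levels[block] + sp_delta.get(block, 0)
        for succ in successors.get(block, []):
            if succ not in levels:
                levels[succ] = exit_level
                pending.append(succ)
            elif levels[succ] != exit_level:
                # Two paths reach succ with different SP adjustments, so no
                # single static offset from SP is valid inside that block.
                needs_frame_pointer = True
    return levels, needs_frame_pointer


# "in_call" branches to abort while 8 bytes of argument space are allocated;
# "no_call" branches to abort with nothing allocated.
merged = {"entry": ["in_call", "no_call"],
          "in_call": ["abort"], "no_call": ["abort"], "abort": []}
split = {"entry": ["in_call", "no_call"],
         "in_call": ["abort1"], "no_call": ["abort2"],
         "abort1": [], "abort2": []}
delta = {"in_call": 8}
_, merged_needs_fp = sp_levels("entry", merged, delta)
_, split_needs_fp = sp_levels("entry", split, delta)
```
</pre>
<p class="MsoNormal">With the two abort() calls merged into one block the sketch reports a conflict; with them kept separate every block has a unique level.</p>
</div>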
<div>
<p class="MsoNormal" style="margin-left:0.5in"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:0.5in">OTOH, we can easily establish the invariant at the MIR level. We should always be able to assign each MBB a unique most recently active call site and an SP adjustment level. We can easily teach BranchFolding to
preserve this invariant. We already do it for funclets.<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:0.5in"><u></u> <u></u></p>
</div>
<blockquote style="border-top:none;border-right:none;border-bottom:none;border-left:1pt solid rgb(204,204,204);padding:0in 0in 0in 6pt;margin:5pt 0in 5pt 4.8pt">
<div>
<div>
<div>
<div>
<div>
<p class="MsoNormal" style="margin-left:0.5in">
I’m not really concerned with funny usage of calls to alloca() in call arguments, or anything like that. I’m happy to pick whatever rule is easiest for us. I’m more concerned with ensuring nothing blows up if we inline a call to a function that contains a
VLA, or something like that.<u></u><u></u></p>
</div>
</div>
</div>
</div>
</div>
</blockquote>
<div>
<p class="MsoNormal" style="margin-left:0.5in"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:0.5in">Sounds good. Inlining dynamic allocas and VLAs should already just work. The inliner places stacksave/stackrestore calls around the original call site, if dynamic allocas were present in the inlined code. <u></u><u></u></p>
<p class="MsoNormal"><u></u> <u></u></p>
<p class="MsoNormal">So stacksave/stackrestore regions properly nest with call.preallocated.setup regions? Sure, that makes sense.<u></u><u></u></p>
<p class="MsoNormal"><u></u> <u></u></p>
<p class="MsoNormal">-Eli<u></u><u></u></p>
</div>
</div>
</div>
</div>
</div>
</blockquote></div>