<div dir="ltr"><div dir="ltr"><div class="gmail_default" style="font-family:monospace;font-size:small;color:#000000"><br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Sun, Apr 25, 2021 at 3:38 AM Roman Lebedev <<a href="mailto:lebedev.ri@gmail.com">lebedev.ri@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On Sun, Apr 25, 2021 at 7:58 AM Xinliang David Li <<a href="mailto:davidxl@google.com" target="_blank">davidxl@google.com</a>> wrote:<br>

><br>

><br>

><br>

> On Sat, Apr 24, 2021 at 11:56 AM Roman Lebedev via Phabricator <<a href="mailto:reviews@reviews.llvm.org" target="_blank">reviews@reviews.llvm.org</a>> wrote:<br>

>><br>

>> lebedev.ri added a comment.<br>

>><br>

>> In D101229#2714853 <<a href="https://reviews.llvm.org/D101229#2714853" rel="noreferrer" target="_blank">https://reviews.llvm.org/D101229#2714853</a>>, @davidxl wrote:<br>

>><br>

>> > Changing the default threshold needs lots of benchmarking.<br>

>><br>

>><br>

>><br>

>> > For this particular case, IMO the better way is to enhance inline cost<br>

>> > analysis to give callsite more bonus if it enables SROA in call context.<br>

>><br>

>> I have thought about that too, yes.<br>

>><br>

>> > The analysis needs to be careful such that if there is another callsite<br>

>> > that blocks SROA, and that callsites can never be inlined, then the bonus<br>

>> > can not be applied.<br>

>><br>

>> So when inlining call to `curr_callee(arg)` from `entry()`,<br>

>> and we've deduced that `arg` is an alloca within `entry()`,<br>

>> we need to run an analysis on `entry()`, and verify that the alloca<br>

>> is not used by anything that would prevent SROA, that's obvious to me.<br>

>><br>

>> The caveat that is a little murky to me still, *how* specifically<br>

>> should we deal with the cases when the alloca is passed as an argument<br>

>> to some other callee? I don't suppose we want to actually recurse into it?<br>

<br>

Thank you for sticking with me!<br>

<br>

> I suppose a preparation step analysis is needed:<br>

> For each SROA candidate (which does not have its address used<br>

> in an intractable way) in the caller function, compute the list<br>

> of call sites with its address passed in.<br>

><br>

> After all callsites are considered for inlining, the rejected callsites<br>

> can be re-examined again. Say we have callsite A and B rejected,<br>

> but they are associated with SROA candidate X.<br>

So far this is reasonably straight-forward.<br>

<br>

> If A and B both can be inlined by applying the bonus,<br>

> then A and B will be inlined.<br>

And that's where things get confusing.<br>

We don't actually compute the full cost of inlining a certain callsite,<br>

we stop as soon as it's obvious that we can't inline it.<br>

<br></blockquote><div><br></div><div><div class="gmail_default" style="font-family:monospace;font-size:small;color:rgb(0,0,0)">This can be changed so that the cost computation can continue just above the threshold+bonus. Failed callsite with cost > threshold + bonus won't be considered again. Those with cost between threshold and threshold + bonus are candidates.</div></div><div><br></div><div><div class="gmail_default" style="font-family:monospace;font-size:small;color:rgb(0,0,0)">Note that we can enable the above only when the callsite takes alloca address as argument(s) to avoid compile time waste.</div><br></div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

So i guess what you are saying is that we need to apply the bonus,<br>

rerun the analysis, check that we can inline,<br>

check that the SROA doesn't get disabled in the callee,<br>

repeat that for all the rejected callees, and then inline them all at once?<br>

<br></blockquote><div><br></div><div><div class="gmail_default" style="font-family:monospace;font-size:small;color:rgb(0,0,0)">See above -- there is no need to rerun the analysis.</div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

> Ideally, the analysis should recursively considering callsites in the callee,<br>

> but the compile time cost can be too large.<br>

> The downside of not considering them is that the additional inlinings<br>

> (due to bonus) may not necessarily result in SROA to happen in the caller.<br>

Yeah, i guess we don't want this to be recursive.<br>

<br></blockquote><div><br></div><div class="gmail_default" style="font-family:monospace;font-size:small;color:rgb(0,0,0)">David</div><div class="gmail_default" style="font-family:monospace;font-size:small;color:rgb(0,0,0)"></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

> David<br>

Roman<br>

<br>

>><br>

>><br>

>> > David<br>

>><br>

>><br>

>><br>

>><br>

>> Repository:<br>

>>   rG LLVM Github Monorepo<br>

>><br>

>> CHANGES SINCE LAST ACTION<br>

>>   <a href="https://reviews.llvm.org/D101229/new/" rel="noreferrer" target="_blank">https://reviews.llvm.org/D101229/new/</a><br>

>><br>

>> <a href="https://reviews.llvm.org/D101229" rel="noreferrer" target="_blank">https://reviews.llvm.org/D101229</a><br>

>><br>

</blockquote></div></div>