<div dir="ltr">I mostly have concerns about the robustness of this approach. For example, it's really hard to implement getSPAdjust correctly. Right now it assumes that call sequences contain a restricted set of MI, and it's not clear to me that that assumption always holds.<div><br></div><div>If we start from pushes, it should be really easy to use local or global heuristics to transform back to a reserved frame. I'm willing to give up on the multi-bb call sequence case caused by cmov lowering. If you all call sequences are in a single MBB, it should be really easy to insert the right adjustments.<div><br></div><div>I also like the idea of having the push nodes available in the DAG so we can get better instruction selection. Maybe the combines that you wrote out on MI are enough for most code, but to me it just seems like we're reimplementing some of isel.</div></div><div><br></div><div>Anyway, I don't really want to block this over this disagreement. I'm not the one doing the implementation. I certainly don't want to spend a week figuring out how to write tablegen patters for PUSH. The store to push conversion pass is not that complicated, and we can go back and reevaluate in the future.</div></div><div class="gmail_extra"><br><div class="gmail_quote">On Sun, Jan 25, 2015 at 1:21 AM, Kuperstein, Michael M <span dir="ltr"><<a href="mailto:michael.m.kuperstein@intel.com" target="_blank">michael.m.kuperstein@intel.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<div lang="EN-US" link="blue" vlink="purple">

<div>

<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">Hi Chandler,<u></u><u></u></span></p>

<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"><u></u> <u></u></span></p>

<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">This is something that Reid and I talked about on IRC, but I don’t think we came to a conclusion both of us were happy with (hence Reid’s “lgtm with reservations”,

 I guess :-) )<u></u><u></u></span></p>

<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"><u></u> <u></u></span></p>

<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">First, I don’t think the decision on whether to use movs or pushes belongs in the DAG.

<u></u><u></u></span></p>

<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">The decision on whether a call-site should use movs or pushes needs to be aware of its context, because having even one call-site use pushes means we will not

 have a reserved call frame, which affects the way all other call sites are treated as well. This patch makes the decision based on global attributes only (opt for size vs. speed, stack alignment), but the next step will be to make it based on an analysis of

 the call-sites – e.g. even with stack alignment of 16, it can still often be a win, depending on just how many of the function calls we can actually transform, and how many memory arguments each call has.<u></u><u></u></span></p>

<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"><u></u> <u></u></span></p>

<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">So the way I envision the next step is that the pass will:<u></u><u></u></span></p>

<p><u></u><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"><span>a)<span style="font:7.0pt "Times New Roman"">     

</span></span></span><u></u><span dir="LTR"></span><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">Collect the necessary information from all call sites in the function.<u></u><u></u></span></p>

<p><u></u><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"><span>b)<span style="font:7.0pt "Times New Roman"">     

</span></span></span><u></u><span dir="LTR"></span><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">Make a judgment on whether the transformation is worth it – in terms of size for Os/Oz, in terms of performance for other opt

 levels.<u></u><u></u></span></p>

<p><u></u><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"><span>c)<span style="font:7.0pt "Times New Roman"">      

</span></span></span><u></u><span dir="LTR"></span><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">Perform the transformation.<u></u><u></u></span></p>

<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">I don’t see how we can do this on the DAG.<a name="14b2065ef8f10edc__MailEndCompose"><u></u><u></u></a></span></p>

<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"><u></u> <u></u></span></p>

<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">If I understand Reid’s last suggestion, he proposed to flip the default – that is, emit pushes in the DAG, and have an MI pass that does the opposite (push

 -> mov) transformation if necessary. <u></u><u></u></span></p>

<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"><u></u> <u></u></span></p>

<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">I don’t believe that removes a lot of complexity or would improve performance.<u></u><u></u></span></p>

<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">The code in PEI, InstrInfo and FrameLowering is just a side effect on not being able to rely on a 0 SPAdj in PEI anymore (that is, canSimplifyCallFramePseudos()

 can now return false), and is needed regardless of how the transformation is performed. And we will still need the heuristic decision.

<u></u><u></u></span></p>

<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">Some of the logic in looking for sequences where the conversion is possible will disappear, but I think a lot of it will remain as conditions on the incoming

 operand DAG nodes. And since we don’t want to transform each push into a “adjust esp, mov” but rather want to group all the esp adjustments back into the ADJCALLSTACKs, we will still need to have code in the pass that make sure this is safe w.r.t to the final

 sequence.<u></u><u></u></span></p>

<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">The main benefit I see is that we will no longer need to have the folding code – rather, we will have to unfold PUSH32rmm, which is simpler. However, I hope

 I can eventually get rid of the folding here by teaching PeepholeOptimizer to be smarter about this.<u></u><u></u></span></p>

<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"><u></u> <u></u></span></p>

<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">On the other hand, X86TargetLowering::LowerCall() is already, IMHO, a fairly complex piece of code, and I’d rather avoid making it even more complex.

<u></u><u></u></span></p>

<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">Conceptually, I’d prefer that LowerCall() did standard mov-based lowering in all cases like it does now (we aren’t always going to lower to pushes anyway –

 it doesn’t really make sense for x864-64) and treat pushes as an optimization where available.<u></u><u></u></span></p>

<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"><u></u> <u></u></span></p>

<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">What do you think?<u></u><u></u></span></p>

<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"><u></u> <u></u></span></p>

<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">Michael<u></u><u></u></span></p>

<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"><u></u> <u></u></span></p>

<p class="MsoNormal"><b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">From:</span></b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif""> Chandler Carruth [mailto:<a href="mailto:chandlerc@google.com" target="_blank">chandlerc@google.com</a>]

<br>

<b>Sent:</b> Thursday, January 22, 2015 23:07<br>

<b>To:</b> <a href="mailto:reviews%2BD6789%2Bpublic%2Ba4ec4af5a5133e84@reviews.llvm.org" target="_blank">reviews+D6789+public+a4ec4af5a5133e84@reviews.llvm.org</a><br>

<b>Cc:</b> Kuperstein, Michael M; Nadav Rotem; Demikhovsky, Elena; Commit Messages and Patches for LLVM<br>

<b>Subject:</b> Re: [PATCH] [X86] Convert esp-relative movs of function arguments to pushes, step 2<u></u><u></u></span></p><div><div class="h5">

<p class="MsoNormal"><u></u> <u></u></p>

<div>

<div>

<p class="MsoNormal" style="line-height:115%"><u></u> <u></u></p>

<div>

<p class="MsoNormal" style="line-height:115%">On Thu, Jan 22, 2015 at 12:57 PM, Reid Kleckner <<a href="mailto:rnk@google.com" target="_blank">rnk@google.com</a>> wrote:<u></u><u></u></p>

<div>

<p class="MsoNormal" style="line-height:115%">lgtm<br>

<br>

I still think forming pushes prior to isel is the way to go long term. It's a lot easier to convert pushes to 'load, SP adjust, store' than it is to go the other way.<u></u><u></u></p>

</div>

</div>

<p class="MsoNormal" style="line-height:115%"><br>

If that is the right way to go, then please don't add this complexity and compile time hit. It doesn't make sense to add code, maintain that code, fix bugs in it, and deal with the compile time when it wouldn't be needed if we took the other approach.<u></u><u></u></p>

</div>

</div>

</div></div></div>

<p>---------------------------------------------------------------------<br>

Intel Israel (74) Limited</p>

<p>This e-mail and any attachments may contain confidential material for<br>

the sole use of the intended recipient(s). Any review or distribution<br>

by others is strictly prohibited. If you are not the intended<br>

recipient, please contact the sender and delete all copies.</p></div>

</blockquote></div><br></div>