<div dir="ltr">Hello LLVM Developers.<div><br></div><div>This week much of my time went into debugging IPRA's effect on higher-level optimizations, specifically the effect of not having callee-saved registers. It was hard, but I learned a lot, and LLDB helped me a great deal. </div><div><br></div><div>Here is the summary for this week:</div><div><br></div><div>Implementation:</div><div>============</div><div><p style="margin:0px;line-height:normal;font-family:Helvetica">Implemented a very simple check that prevents the no-callee-saved-registers optimization from being applied to functions that are recursive or may be optimized as tail calls. Also added a simple statistic that counts the number of functions optimized to have no callee-saved registers.</p><p style="margin:0px;line-height:normal;font-family:Helvetica"><br></p><p style="margin:0px;line-height:normal;font-family:Helvetica">Testing:</p><p style="margin:0px;line-height:normal;font-family:Helvetica">======</p><p style="margin:0px;line-height:normal;font-family:Helvetica">Debugged the test cases that were failing due to the no-callee-saved-registers optimization. More details with examples can be found here <a href="https://groups.google.com/d/topic/llvm-dev/TSoYxeMMzxM/discussion">https://groups.google.com/d/topic/llvm-dev/TSoYxeMMzxM/discussion</a> . Now all tests in the LLVM test-suite pass.</p><p style="margin:0px;line-height:normal;font-family:Helvetica"><br></p><p style="margin:0px;line-height:normal;font-family:Helvetica">Study:</p><p style="margin:0px;line-height:normal;font-family:Helvetica">=====</p><p style="margin:0px;line-height:normal;font-family:Helvetica">To find ideas for improving the current IPRA, I read two papers: “Minimizing Register Usage Penalty at Procedure Calls” by Fred C. Chow and “Register Allocation Across Procedure and Module Boundaries” by Santhanam and Odnert. 
</p><p style="margin:0px;line-height:normal;font-family:Helvetica">1) From the first paper I like the idea of shrink-wrap analysis. LLVM currently has this optimization, but its approach is completely different. I have started a discussion about it, which can be found here <a href="https://groups.google.com/d/topic/llvm-dev/_mZoGUQDMGo/discussion">https://groups.google.com/d/topic/llvm-dev/_mZoGUQDMGo/discussion</a> . I would like to talk to Quentin Colombet more about this.</p><p style="margin:0px;line-height:normal;font-family:Helvetica"></p><p style="margin:0px;line-height:normal;font-family:Helvetica">2) From the second paper I like the idea of spill code motion: the spill code for a callee-saved register is pushed up to a less frequently called caller. However, the approach described in the paper requires call frequency details, and it also defers register allocation until very late. The optimization itself needs register usage details, but it operates on a register usage estimate made at an earlier stage. It also requires help from the intra-procedural register allocator. 
I would like to discuss this further over IRC this Monday with my mentors.</p></div><div><br></div><div><div style="font-size:13px">Plan for next week:</div><div style="font-size:13px">==============</div><div style="font-size:13px">1) Rebase pending patches and get the review process completed.</div><div style="font-size:13px">2) Discuss how the identified ideas can be implemented within the current infrastructure.</div><div style="font-size:13px">3) Discuss how to handle indirect function calls within IPRA.</div><div style="font-size:13px"><br></div></div><div style="font-size:13px">Sincerely,</div><div style="font-size:13px">Vivek</div><div><br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Sun, Jun 26, 2016 at 5:18 PM, vivek pandya <span dir="ltr"><<a href="mailto:vivekvpandya@gmail.com" target="_blank">vivekvpandya@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">Hello LLVM Developers,<div><br></div><div>Please find below a summary of the work done during this week.</div><div><br></div><div>Implementation:</div><div>============</div><div><p style="margin:0px;line-height:normal;font-family:Helvetica">During this week the patch for bug 28144 was updated after finding further refinements to the regmask calculation. As suggested by Matthias Braun and Hal Finkel, the regmask calculation code is the same as MachineRegisterInfo::isPhysRegModified() except that no isNoReturnDef() check is required. So we proposed adding a bool argument SkipNoReturnDef, with default value false, to the isPhysRegModified method, so that the code can be reused for IPRA without breaking current uses of isPhysRegModified. The patch can be found here: <a href="http://reviews.llvm.org/D21395" target="_blank">http://reviews.llvm.org/D21395</a></p>

<p style="margin:0px;line-height:normal;font-family:Helvetica">With IPRA, to improve code quality, call sites of local functions are forced to use caller-saved registers (better heuristics will be implemented later). I have been experimenting with this on my local machine and discovered that tail call optimization is affected by it: some test cases in the test-suite fail with a segmentation fault or infinite recursion because a counter value gets overwritten. Please find more details and an example bug at <a href="https://groups.google.com/d/msg/llvm-dev/TSoYxeMMzxM/rb9e_M2iEwAJ" target="_blank">https://groups.google.com/d/msg/llvm-dev/TSoYxeMMzxM/rb9e_M2iEwAJ</a></p>

<p style="margin:0px;line-height:normal;font-family:Helvetica">I have also tried a very simple method to handle indirect function calls in IPRA, but at higher optimization levels indirect function calls are converted to direct function calls, so I request interested community members to guide me. We can discuss this on Monday morning (PDT). More discussion on this can be found here: <a href="https://groups.google.com/d/msg/llvm-dev/dPk3lKwH1kU/GNfhD_jKEQAJ" target="_blank">https://groups.google.com/d/msg/llvm-dev/dPk3lKwH1kU/GNfhD_jKEQAJ</a></p><p style="margin:0px;line-height:normal;font-family:Helvetica"><br></p><p style="margin:0px;line-height:normal;font-family:Helvetica">Testing:</p><p style="margin:0px;line-height:normal;font-family:Helvetica">======</p><p style="margin:0px;line-height:normal;font-family:Helvetica">I think the IPRA optimization has become more stable after the bug fix, so I ran the test-suite with it. Also, as suggested by Quentin Colombet, I ran the test-suite with only the codegen order changed to bottom-up on the call graph. Overall this codegen order improves runtime and compile time. I have shared the results here:</p><p style="margin:0px;line-height:normal;font-family:Arial;color:rgb(18,85,204)"><span style="text-decoration:underline"><a href="https://docs.google.com/document/d/1At3QqEWmeDEXnDVz-CGh2GDlYQR3VRz3ipIfcXoLC3c/edit?usp=sharing" target="_blank">https://docs.google.com/document/d/1At3QqEWmeDEXnDVz-CGh2GDlYQR3VRz3ipIfcXoLC3c/edit?usp=sharing</a></span></p><p style="margin:0px;line-height:normal;font-family:Arial;color:rgb(35,35,35);min-height:15px"><br></p><p style="margin:0px;line-height:normal;font-family:Helvetica">


</p><p style="margin:0px;line-height:normal;font-family:Arial;color:rgb(18,85,204)"><span style="text-decoration:underline"><a href="https://docs.google.com/document/d/1hS-Cj3mEDqUCTKTYaJpoJpVOBk5E2wHK9XSGLowNPeM/edit?usp=sharing" target="_blank">https://docs.google.com/document/d/1hS-Cj3mEDqUCTKTYaJpoJpVOBk5E2wHK9XSGLowNPeM/edit?usp=sharing</a></span></p></div><div><br></div><div>Plan for next week:</div><div>==============</div><div>1) Rebase pending patches and get the review process completed.</div><div>2) Solve the tail-call-related bug.</div><div>3) Discuss some ideas and heuristics for improving IPRA.</div><div>4) Discuss how to handle indirect function calls within IPRA.</div><div>5) More testing with the LLVM test-suite.</div><div><br></div><div>Sincerely,</div><div>Vivek</div><div><br></div></div><div class="HOEnZb"><div class="h5"><div class="gmail_extra"><br><div class="gmail_quote">On Tue, Jun 21, 2016 at 9:26 AM, vivek pandya <span dir="ltr"><<a href="mailto:vivekvpandya@gmail.com" target="_blank">vivekvpandya@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote"><span>On Tue, Jun 21, 2016 at 1:45 AM, Matthias Braun <span dir="ltr"><<a href="mailto:matze@braunis.de" target="_blank">matze@braunis.de</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span><br>

> On Jun 20, 2016, at 12:53 PM, Sanjoy Das via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>> wrote:<br>

><br>

> Hi Vivek,<br>

><br>

> vivek pandya via llvm-dev wrote:<br>

> >     int foo() {<br>

> >     return 12;<br>

> >     }<br>

> ><br>

> >     int bar(int a) {<br>

> >     return foo() + a;<br>

> >     }<br>

> ><br>

> >     int (*fp)() = 0;<br>

> >     int (*fp1)(int) = 0;<br>

> ><br>

> >     int main() {<br>

> >     fp = foo;<br>

> >     fp();<br>

> >     fp1 = bar;<br>

> >     fp1(15);<br>

> >     return 0;<br>

> >     }<br>

><br>

> IMO it is a waste of time trying to do a better job at the IPRA level on<br>

> IR like the above ^.  LLVM should be folding the indirect calls to<br>

> direct calls at the IR level, and if it isn't that's a bug in the IR<br>

> level optimizer.<br>

</span>+1 from me.<br>

<br></blockquote></span><div>Yes, at -O3 simple indirect calls, including virtual function calls, are optimized to direct calls.</div><span><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

The interesting cases are the non-obvious ones (assuming foo/bar have the same parameters). Things get interesting once you have uncertainty in the mix. The minimally interesting case would look like this:<br>

<br>

int main() {<br>

    int (*fp)();<br>

    if (rand()) {<br>

        fp = foo;<br>

    } else {<br>

        fp = bar;<br>

    }<br>

    fp(42);<br>

}<br></blockquote><div> </div></span><div>I tried this case and my simple hack fails to optimize it :-) . This requires discussion on IRC.</div><div><br></div><div>Sincerely,</div><div>-Vivek</div><span><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

However, predicting the possible targets of a call is IMO a question of computing a call graph data structure and improving upon that. We should be sure that we discuss and implement this independently of the register allocation work!<br>

<span><font color="#888888"><br>

- Matthias<br>

<br>

</font></span></blockquote></span></div><br></div></div>

</blockquote></div><br></div>

</div></div></blockquote></div><br></div>