<div dir="ltr">Just FYI:<div>You can use Herald (<a href="https://reviews.llvm.org/herald/">https://reviews.llvm.org/herald/</a>) to configure rules that auto-add you as a subscriber/reviewer to various changes.</div><div>Pretty much anything you can query you can do :)</div><div><br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Thu, May 4, 2017 at 11:45 AM, Kit Barton via llvm-dev <span dir="ltr"><<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Francis Visoiu Mistrih <<a href="mailto:fvisoiumistrih@apple.com">fvisoiumistrih@apple.com</a>> writes:<br>

<br>

<br>

I'm resending this as I forgot to CC llvm-dev on my reply last night....<br>

<br>

<br>

Hi Francis,<br>

<br>

Thanks for the detailed reply!<br>

I took a quick look at the patch, and I'll try to take a closer look at<br>

it tomorrow.<br>

<br>

Could you please subscribe me to future patches? Either I or<br>

someone from my team can help with testing and/or integration on PPC if<br>

you'd like. This is something we've been struggling with for a while and<br>

are anxious to make some progress on it :)<br>

<span class="HOEnZb"><font color="#888888"><br>

Kit<br>

</font></span><div class="HOEnZb"><div class="h5"><br>

> Hi Kit,<br>

><br>

>> On May 2, 2017, at 7:54 PM, Kit Barton via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>> wrote:<br>

>><br>

>><br>

>> Hi all,<br>

>><br>

>> We've seen several examples recently of performance opportunities on<br>

>> POWER if we can improve the location of save/restore code for<br>

>> callee-saved registers. Both Nemanja and I have discussed this with<br>

>> several people, and it seems that there are two possibilities for<br>

>> improving this:<br>

>><br>

>>  1. Extend shrink wrapping to make the analysis of callee-saved<br>

>>     registers more precise.<br>

>>  2. Focus on enabling and (possibly) improving SplitCSR.<br>

>><br>

>> I would like opinions from people on the preferred way to proceed.<br>

>><br>

>> I am leaning toward improving shrink wrapping, at least as a short-term<br>

>> solution. However, I fully admit that this is because I am familiar with<br>

>> the shrink wrapping code and completely naive about SplitCSR and what<br>

>> work would be necessary to get this working well.<br>

>><br>

>> My proposal would be to implement the flow-sensitive analysis described<br>

>> by Fred Chow (PLDI '88) and make the necessary extensions in shrink<br>

>> wrapping to handle multiple save/restore points. At that point we can do<br>

>> an evaluation to understand the improvements it provides and the impact<br>

>> on compile time. Once we have these results, we can look at the best way<br>

>> to enable it (e.g., option, target opt-in, higher opts, etc.).<br>

><br>

> Back in 2009, there was an implementation of Fred Chow’s algorithm, which was removed in r193749 because it was unused and untested.<br>

><br>

> I have been working on a new implementation of Fred Chow’s algorithm for a while now.<br>

><br>

> It seems that this algorithm has been avoided because its compile-time impact was not worth the performance improvement.<br>

><br>

> The main reason is that there are probably loops in the CFG. In my<br>

> implementation, we decided that we never want to save / restore inside a loop,<br>

> so we consider loops as a single block. We are using scc_iterators instead of<br>

> MachineLoopInfo in order to handle irreducible loops as well, which are not<br>

> handled by the current shrink-wrapping implementation. That way, we can compute the<br>

> anticipation / availability attributes in linear time.<br>

><br>

> In terms of correctness, there is one test on X86 that still fails, and two others on AArch64, because of the way compact unwinding encodes register saves.<br>

><br>

> In terms of compile time, there are no regressions on CTMark.<br>

><br>

> In terms of code size, we get a 0.8% increase on X86_64 and AArch64, mostly because we can no longer use push / pop.<br>

><br>

> For now, we have only worked on shrink-wrapping CSRs and still keep the stack setup in the entry block / return blocks, which can give worse results in some cases compared to the current shrink-wrapping pass. I am currently working on fixing this.<br>

><br>

> For execution time, not many improvements showed up, because of the stack setup and the transformation of push/pop -> mov $reg, (mem)/mov (mem), $reg, which can be partially solved.<br>

><br>

> In terms of what the algorithm can do, and how it can outperform the current one, I got some stats based on where we save / restore, along with the block frequency (with PGO), and we can see a theoretical 8% improvement.<br>

><br>

> I put an early review here a while ago: <a href="https://reviews.llvm.org/D30808" rel="noreferrer" target="_blank">https://reviews.llvm.org/<wbr>D30808</a>, and I<br>

> will update it soon. As you can see, all the PrologEpilogInserter and<br>

> (X86|AArch64)<wbr>TargetFrameLowering code looks horribly hacky, because so many<br>

> things assume the stack setup and callee-saved register saves will stick together. Fixing<br>

> all those assumptions is going to be the trickiest part of shrink-wrapping,<br>

> not the algorithm itself.<br>

><br>

> There are some cases where the current shrink-wrapping is better, such as the<br>

> one in Fig. 3 of Chow’s paper. That can probably be solved by<br>

> using something similar to what’s described in the paper Post Register<br>

> Allocation Spill Code Optimization by Christopher Lupo and Kent D. Wilken, which<br>

> uses the PST to optimize save / restore placement along with a cost model.<br>

><br>

> Cheers,<br>

<br>

</div></div><div class="HOEnZb"><div class="h5">______________________________<wbr>_________________<br>

LLVM Developers mailing list<br>

<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a><br>

<a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" rel="noreferrer" target="_blank">http://lists.llvm.org/cgi-bin/<wbr>mailman/listinfo/llvm-dev</a><br>

</div></div></blockquote></div><br></div>