<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Sat, Aug 20, 2016 at 4:01 PM, Philip Reames <span dir="ltr"><<a href="mailto:listmail@philipreames.com" target="_blank">listmail@philipreames.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">reames added a comment.<br>
<br>
Sorry for not responding to this for so long.<br>
<br>
My objection is primarily from a compile time concern. Right now, EarlyCSE is a *very* cheap pass to run. If you can keep it fast (even when we have to reconstruct MemorySSA) I don't object to having EarlyCSE MemorySSA based. I think that is a very hard bar to pass in practice. In particular, the bar is not total O3 time. It's EarlyCSE time. </blockquote><div><br></div><div>The current time to construct MemorySSA is basically nothing, even on large and absurd testcases.</div><div>You can't make it *zero*, because it does an extra instruction walk or two on top of what EarlyCSE already does.</div><div>But if what you want is fast, EarlyCSE is the fastest pass, even on the large and absurd testcases I can find.</div><div>That doesn't change after this patch, AFAICT.</div><div><br class="">I disabled LICM, since on this testcase LICM alone takes over 100 seconds.<br></div><div><br></div><div>Example at O2:</div><div><br><br></div><div><div> ---User Time--- --System Time-- --User+System-- ---Wall Time--- --- Name ---</div><div> 7.4859 ( 18.9%) 0.0096 ( 1.5%) 7.4955 ( 18.7%) 7.5000 ( 18.7%) Loop-Closed SSA Form Pass</div><div> 4.2809 ( 10.8%) 0.0127 ( 2.0%) 4.2936 ( 10.7%) 4.3000 ( 10.7%) Loop-Closed SSA Form Pass</div><div> 3.8189 ( 9.7%) 0.0101 ( 1.6%) 3.8290 ( 9.5%) 3.8414 ( 9.6%) Value Propagation</div><div> 3.7057 ( 9.4%) 0.0042 ( 0.7%) 3.7099 ( 9.2%) 3.7136 ( 9.2%) Value Propagation</div><div> 2.1060 ( 5.3%) 0.3386 ( 53.3%) 2.4446 ( 6.1%) 2.4457 ( 6.1%) Loop Load Elimination</div><div> 2.1764 ( 5.5%) 0.0127 ( 2.0%) 2.1891 ( 5.5%) 2.1916 ( 5.5%) Combine redundant instructions</div><div> 2.0032 ( 5.1%) 0.0029 ( 0.5%) 2.0062 ( 5.0%) 2.0081 ( 5.0%) Dead Store Elimination</div><div> 1.9702 ( 5.0%) 0.0168 ( 2.6%) 1.9869 ( 4.9%) 1.9922 ( 5.0%) Combine redundant instructions</div><div> 1.8076 ( 4.6%) 0.0024 ( 0.4%) 1.8100 ( 4.5%) 1.8119 ( 4.5%) Loop-Closed SSA Form Pass</div><div> 1.7428 ( 4.4%) 0.0011 ( 0.2%) 1.7439 ( 4.3%) 1.7443 ( 4.3%) Loop-Closed SSA Form Pass</div><div> 1.2008 ( 3.0%) 0.0113 ( 1.8%) 1.2120 ( 3.0%) 1.2135 ( 3.0%) Combine redundant instructions</div><div> 1.0021 ( 2.5%) 0.0116 ( 1.8%) 1.0136 ( 2.5%) 1.0141 ( 2.5%) Combine redundant instructions</div><div> 0.9832 ( 2.5%) 0.0121 ( 1.9%) 0.9952 ( 2.5%) 0.9957 ( 2.5%) Combine redundant instructions</div><div> 0.9680 ( 2.4%) 0.0110 ( 1.7%) 0.9790 ( 2.4%) 0.9793 ( 2.4%) Combine redundant instructions</div><div> 0.7698 ( 1.9%) 0.0069 ( 1.1%) 0.7767 ( 1.9%) 0.7776 ( 1.9%) Induction Variable Simplification</div><div> 0.5041 ( 1.3%) 0.0063 ( 1.0%) 0.5104 ( 1.3%) 0.5107 ( 1.3%) Combine redundant instructions</div><div> 0.4878 ( 1.2%) 0.0064 ( 1.0%) 0.4942 ( 1.2%) 0.4943 ( 1.2%) Combine redundant instructions</div></div><div>....</div><div><div> 0.2833 ( 0.7%) 0.0188 ( 3.0%) 0.3021 ( 0.8%) 0.3025 ( 0.8%) Early GVN Hoisting of Expressions</div><div> 0.2149 ( 0.5%) 0.0017 ( 0.3%) 0.2166 ( 0.5%) 0.2167 ( 0.5%) Early CSE</div><div> 0.2136 ( 0.5%) 0.0027 ( 0.4%) 0.2163 ( 0.5%) 0.2163 ( 0.5%) Early CSE</div><div> 0.2036 ( 0.5%) 0.0014 ( 0.2%) 0.2050 ( 0.5%) 0.2050 ( 0.5%) Early CSE</div></div><div><br></div><div><br></div><div>Note that the GVN and EarlyCSE times include a full build of MemorySSA because of where the passes are run.</div><div><br></div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> I fully expect that the more precise analysis may speed up other passes, but we can't assume that happens for all inputs. (As I write this, I'm recognizing that this might be too high a bar to set. If you think I'm being unreasonable, argue why and what a better line should be.)<br>
<br>
Given I'm not going to have time to be actively involved in this thread, I'm going to defer to other reviewers. If they think this is a good idea, I will not actively block the thread.<br>
<br>
<br>
<a href="https://reviews.llvm.org/D19821" rel="noreferrer" target="_blank">https://reviews.llvm.org/<wbr>D19821</a><br>
<br>
<br>
<br>
</blockquote></div><br></div></div>