[PATCH] D19821: [EarlyCSE] Use MemorySSA if available.

Sun Aug 21 08:32:00 PDT 2016

On Sat, Aug 20, 2016 at 4:01 PM, Philip Reames <listmail at philipreames.com>
wrote:

> reames added a comment.
>
> Sorry for not responding to this for so long.
>
> My objection is primarily from a compile time concern.  Right now,
> EarlyCSE is a *very* cheap pass to run.  If you can keep it fast (even when
> we have to reconstruct MemorySSA) I don't object to having EarlyCSE
> MemorySSA based.  I think that is a very hard bar to pass in practice.  In
> particular, the bar is not total O3 time.  It's EarlyCSE time.

The current time to construct MemorySSA is basically nothing, even on large
and absurd testcases.
You can't make it *zero* because it does an extra instruction walk or
two over EarlyCSE.
But if what you want is fast, EarlyCSE is the fastest pass, even on the large
and absurd testcases I can find.
That doesn't change after this patch, AFAICT.

I disabled LICM, since LICM alone takes over 100 seconds on this testcase.

Example at O2:

  ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
   7.4859 ( 18.9%)   0.0096 (  1.5%)   7.4955 ( 18.7%)   7.5000 ( 18.7%)  Loop-Closed SSA Form Pass
   4.2809 ( 10.8%)   0.0127 (  2.0%)   4.2936 ( 10.7%)   4.3000 ( 10.7%)  Loop-Closed SSA Form Pass
   3.8189 (  9.7%)   0.0101 (  1.6%)   3.8290 (  9.5%)   3.8414 (  9.6%)  Value Propagation
   3.7057 (  9.4%)   0.0042 (  0.7%)   3.7099 (  9.2%)   3.7136 (  9.2%)  Value Propagation
   2.1060 (  5.3%)   0.3386 ( 53.3%)   2.4446 (  6.1%)   2.4457 (  6.1%)  Loop Load Elimination
   2.1764 (  5.5%)   0.0127 (  2.0%)   2.1891 (  5.5%)   2.1916 (  5.5%)  Combine redundant instructions
   2.0032 (  5.1%)   0.0029 (  0.5%)   2.0062 (  5.0%)   2.0081 (  5.0%)  Dead Store Elimination
   1.9702 (  5.0%)   0.0168 (  2.6%)   1.9869 (  4.9%)   1.9922 (  5.0%)  Combine redundant instructions
   1.8076 (  4.6%)   0.0024 (  0.4%)   1.8100 (  4.5%)   1.8119 (  4.5%)  Loop-Closed SSA Form Pass
   1.7428 (  4.4%)   0.0011 (  0.2%)   1.7439 (  4.3%)   1.7443 (  4.3%)  Loop-Closed SSA Form Pass
   1.2008 (  3.0%)   0.0113 (  1.8%)   1.2120 (  3.0%)   1.2135 (  3.0%)  Combine redundant instructions
   1.0021 (  2.5%)   0.0116 (  1.8%)   1.0136 (  2.5%)   1.0141 (  2.5%)  Combine redundant instructions
   0.9832 (  2.5%)   0.0121 (  1.9%)   0.9952 (  2.5%)   0.9957 (  2.5%)  Combine redundant instructions
   0.9680 (  2.4%)   0.0110 (  1.7%)   0.9790 (  2.4%)   0.9793 (  2.4%)  Combine redundant instructions
   0.7698 (  1.9%)   0.0069 (  1.1%)   0.7767 (  1.9%)   0.7776 (  1.9%)  Induction Variable Simplification
   0.5041 (  1.3%)   0.0063 (  1.0%)   0.5104 (  1.3%)   0.5107 (  1.3%)  Combine redundant instructions
   0.4878 (  1.2%)   0.0064 (  1.0%)   0.4942 (  1.2%)   0.4943 (  1.2%)  Combine redundant instructions
....
   0.2833 (  0.7%)   0.0188 (  3.0%)   0.3021 (  0.8%)   0.3025 (  0.8%)  Early GVN Hoisting of Expressions
   0.2149 (  0.5%)   0.0017 (  0.3%)   0.2166 (  0.5%)   0.2167 (  0.5%)  Early CSE
   0.2136 (  0.5%)   0.0027 (  0.4%)   0.2163 (  0.5%)   0.2163 (  0.5%)  Early CSE
   0.2036 (  0.5%)   0.0014 (  0.2%)   0.2050 (  0.5%)   0.2050 (  0.5%)  Early CSE

Note that the GVN and EarlyCSE times include a full build of MemorySSA
because of where the passes are run.
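
Since each pass shows up once per invocation in the report, it can help to
add the rows up by pass name. A minimal Python sketch of that arithmetic
follows. The ROWS list and the total_by_pass helper are illustrative names,
not part of any LLVM tooling. The numbers are the User+System column copied
from the rows shown above, so anything hidden behind the "...." elision is
not counted.

```python
from collections import defaultdict

# (pass name, User+System seconds), copied from the rows shown in the report.
# Rows hidden behind the "...." elision are NOT included, so these are
# partial totals for illustration only.
ROWS = [
    ("Loop-Closed SSA Form Pass", 7.4955),
    ("Loop-Closed SSA Form Pass", 4.2936),
    ("Loop-Closed SSA Form Pass", 1.8100),
    ("Loop-Closed SSA Form Pass", 1.7439),
    ("Early GVN Hoisting of Expressions", 0.3021),
    ("Early CSE", 0.2166),
    ("Early CSE", 0.2163),
    ("Early CSE", 0.2050),
]


def total_by_pass(rows):
    """Sum the per-invocation times for each pass name."""
    totals = defaultdict(float)
    for name, seconds in rows:
        totals[name] += seconds
    return dict(totals)


totals = total_by_pass(ROWS)

# Print the most expensive pass first.
for name, seconds in sorted(totals.items(), key=lambda kv: -kv[1]):
    print(f"{seconds:8.4f}  {name}")
```

Summed this way, the three Early CSE runs shown come to about 0.64 seconds
in total, against about 15.3 seconds for the four Loop-Closed SSA Form Pass
runs shown.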

> I fully expect that the more precise analysis may speed up other passes,
> but we can't assume that happens for all inputs.  (As I write this, I'm
> recognizing that this might be too high a bar to set.  If you think I'm
> being unreasonable, argue why and what a better line should be.)
>
> Given I'm not going to have time to be actively involved in this thread, I'm
> going to defer to other reviewers.  If they think this is a good idea, I
> will not actively block the thread.
>
>
> https://reviews.llvm.org/D19821
>
>
>
>