<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Nov 16, 2016 at 2:03 AM, David Chisnall via llvm-dev <span dir="ltr"><<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">This is really great to see, as I’ve spent far too much of my life over the past two years fighting with undocumented assumptions made by GVN.  A couple of quick questions about the new GVN, based on problems I’ve had with the old one:<br>

<br>

Does it assume that it’s always safe to widen a load (or store) to a power of two? </blockquote><div><br></div><div>I don't believe old gvn does widening any more, and new gvn certainly doesn't.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> For our target, this is only sound if you can show that the pointer was used to read all of the bytes that you are loading (we have byte-granularity memory safety).  Old GVN has no hooks for targets to specify whether this is safe and so is implicitly assuming a page-based MMU.  This optimisation is also unsound for M-profile ARM cores, though will fail occasionally there, whereas it fails deterministically for us.<br>

<br>

Does it make any assumptions about the layout of memory in pointers?  </blockquote><div>Not that i know of</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Old GVN treats pointers as integers and assumes that it’s safe to do partial stores to them.</blockquote><div>Can you give an example?</div><div>Is this store coercion or something?<br> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">  As a prerequisite for memory safety, we must be able to guarantee atomic updates to pointers and we had to hack GVN to disable a bunch of these things.  In LLVM IR, pointers are opaque and there is no guarantee that their representation is the same as a same-sized integer, but old GVN makes this assumption.<br></blockquote><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<br>

David<br>

<div><div class="h5"><br>

> On 16 Nov 2016, at 06:49, Davide Italiano via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>> wrote:<br>

><br>

> Hi,<br>

> we would like to propose  a new Global Value Numbering pass in LLVM.<br>

> The ideas/code are from Daniel Berlin (with a minor overhaul/splitting<br>

> into submittable patches from me). The code has been around for a<br>

> while (2012 or before), and we think it's getting ready to be<br>

> committed upstream.<br>

><br>

> ### Motivation<br>

><br>

> To put things into context: my personal motivation for having a new<br>

> GVN/PRE algorithm is LTO.<br>

> It's not a secret that LLVM is getting slower and slower release after<br>

> release, as Rafael discovered/pointed out in March [1] (and probably<br>

> many others found out). I personally took a shot at profiling LTO on<br>

> many internal/opensource applications (including clang itself) and<br>

> noticed that GVN always show in the top-3 passes (and it's generally<br>

> the pass where we spend most of the time in the middle-end). There are<br>

> cases (extreme) where 90% of the compile time goes in GVN.<br>

><br>

> Example:<br>

> ===---------------------------<wbr>------------------------------<wbr>----------------===<br>

>                      ... Pass execution timing report ...<br>

> ===---------------------------<wbr>------------------------------<wbr>----------------===<br>

>  Total Execution Time: 684.7622 seconds (683.7141 wall clock)<br>

><br>

>   ---User Time---   --System Time--   --User+System--   ---Wall<br>

> Time---  --- Name ---<br>

>  130.2558 ( 20.0%)   6.3128 ( 18.7%)  136.5686 ( 19.9%)  137.6145 (<br>

> 20.1%)  X86 DAG->DAG Instruction Selection<br>

>  55.4695 (  8.5%)   0.5501 (  1.6%)  56.0196 (  8.2%)  56.1049 (<br>

> 8.2%)  Function Integration/Inlining<br>

>  42.3492 (  6.5%)   0.0364 (  0.1%)  42.3856 (  6.2%)  42.8676 (<br>

> 6.3%)  Global Value Numbering<br>

><br>

> ### Problems in the current GVN<br>

><br>

> There are some issues in the current GVN infrastructure. A<br>

> non-exhaustive list (I'm pretty sure people have their list of<br>

> complains, these are the ones I care about the most).<br>

> * GVN is very slow for large test cases<br>

> * GVN doesn't do a real analysis, instead it eliminates as it goes, so<br>

> it's hard to reason about it.<br>

> * GVN doesn't perform any phi predication, i.e. doesn't know about phi<br>

> nodes, so later passes have to do some extra work to clean up<br>

> * There are bugs, e.g. [2] which would require a rewrite of PRE to be fixed.<br>

><br>

> ### NewGVN<br>

><br>

> The new algorithm implements the ideas described in [3] with some<br>

> engineering optimizations by Dan (for example the set of touched<br>

> instructions is represented using a Bitvector instead of a set because<br>

> it's not uncommon for large functions where a predicate change that<br>

> thousands of instructions need to be changed, and we both measured 30%<br>

> of the whole pass time spent just modifying the set).<br>

> The code pretty much maps what the paper describe so I won't try to<br>

> repeat it here =)<br>

><br>

> Some advantages of NewGVN:<br>

> * GVN performs a real analysis (which is now part of the pass itself<br>

> but potentially could be split and used as an utility by other<br>

> passes). For example, Quentin/Dan pointed out that outlining at the IR<br>

> level is hard without a proper value numbering analysis.<br>

> * On large testcases, It's faster than current GVN, even though didn't<br>

> go through a lot of profiling/optimization lately (I found it up to<br>

> 50% faster in compile time on some internal benchmarks). There are<br>

> some places were we can improve. For example, we spend a lot of time<br>

> inside Simplify* functions. Another considerable chunk of the time is<br>

> spent inside MemorySSA, but this is going to change once we preserve<br>

> MemSSA more aggressively.<br>

> * The code is easier to understand (at least to me)<br>

> [...]<br>

><br>

> Some current limitations of NewGVN:<br>

> * NewGVN doesn't do everything that the current GVN does. There are<br>

> plans to implement missing features and more.<br>

> * On small testcases NewGVN may be slower as it does some work upfront<br>

> that the current GVN doesn't do. NewGVN is probably less lazy than it<br>

> should be, but it didn't matter for us so we didn't consider it a<br>

> priority. If somebody cares and finds a case where NewGVN<br>

> substantially regresses, speak up.<br>

><br>

> The initial code can be found here <a href="https://reviews.llvm.org/D26224" rel="noreferrer" target="_blank">https://reviews.llvm.org/<wbr>D26224</a><br>

> The current patch includes only the "core" GVN algorithm, i.e. the<br>

> expression framework/the logic to perform congruence finding. Other<br>

> pieces (e.g. PRE will build on top of that).<br>

><br>

> My rough plan is:<br>

> * Try to get this reviewed/tested and in tree.<br>

> * Fix miscompiles/bugs (and performance problems as/if they arise).<br>

> * Build other pieces of the algorithm on top of what we have. My<br>

> immediate concern will be implementing support for llvm.assume and<br>

> load coercion. Dan said he will try to find the time to work on the<br>

> predicate handling (actually, there's already a branch<br>

> <a href="https://github.com/dberlin/llvm-gvn-rewrite/tree/newgvn-predicates" rel="noreferrer" target="_blank">https://github.com/dberlin/<wbr>llvm-gvn-rewrite/tree/newgvn-<wbr>predicates</a>).<br>

><br>

> Please let us know what you think. Any feedback/review comment/testing<br>

> is very appreciated!<br>

><br>

> Thanks!<br>

><br>

> [1] <a href="http://lists.llvm.org/pipermail/llvm-dev/2016-March/096488.html" rel="noreferrer" target="_blank">http://lists.llvm.org/<wbr>pipermail/llvm-dev/2016-March/<wbr>096488.html</a><br>

> [2] <a href="https://llvm.org/bugs/show_bug.cgi?id=30692#c11" rel="noreferrer" target="_blank">https://llvm.org/bugs/show_<wbr>bug.cgi?id=30692#c11</a><br>

> [3] <a href="http://dl.acm.org/citation.cfm?id=512536" rel="noreferrer" target="_blank">http://dl.acm.org/citation.<wbr>cfm?id=512536</a><br>

><br>

> --<br>

> Davide<br>

><br>

> "There are no solved problems; there are only problems that are more<br>

> or less solved" -- Henri Poincare<br>

</div></div>> ______________________________<wbr>_________________<br>

> LLVM Developers mailing list<br>

> <a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a><br>

> <a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" rel="noreferrer" target="_blank">http://lists.llvm.org/cgi-bin/<wbr>mailman/listinfo/llvm-dev</a><br>

<br>

______________________________<wbr>_________________<br>

LLVM Developers mailing list<br>

<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a><br>

<a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" rel="noreferrer" target="_blank">http://lists.llvm.org/cgi-bin/<wbr>mailman/listinfo/llvm-dev</a><br>

</blockquote></div><br></div></div>