[llvm-dev] [RFC] NewGVN

Tue Nov 15 23:32:16 PST 2016

On Tue, Nov 15, 2016 at 11:29 PM, Daniel Berlin <dberlin at dberlin.org> wrote:
> First, thanks. This is a very very long time coming :)
> Second, for those watching, note that pretty much all of the improvements
> and missing cases, including load forwarding/coercion, etc, actually are
> done.
> It's more a matter of cleaning them up, breaking it down, and submitting it,
> than "implementing them".

Oh sure, bad wording. For those interested in the other pieces, this
is Dan's branch
https://github.com/dberlin/llvm-gvn-rewrite

> The main experimentation at this point is "can we do it cleaner" not "can we
> do it" :)
>
> It's also important to note that this new GVN also treats loads/stores in a
> unified way with scalars, unlike current GVN (which has no load or store
> value numbering).
>
> So it will happily discover complex load/store relations (though there is
> some improvements we can still make here)
> For example:
>
> int vnum_test8(int *data)
> {
>   int i;
>   int stop = data[3];
>   int m = data[4];
>   int n = m;
>   int p;
>   for (i=0; i<stop; i++) {
>     int k = data[2];
>     data[k] = 2;
>     data[0] = m - n;
>     k = data[1];
>     m = m + k;
>     n = n + k;
>     p = data[0];
>   }
>   return p;
> }
>
> LLVM's current GVN will eliminate a single load here[1]
> NewGVN will calculate that m and n are equal, that m-n is 0, that p is 0
>
> It's not quite perfect yet, i haven't fixed store handling, so the following
> is missed:
> int a;
> int *p;
> // LLVM is too smart and if we don't do this, realizes *p is a store to
> undef
> void foo(){
>     p = &a;
> }
> int main(int argc, char **argv) {
>   int result;
>   foo();
>   *p = 2;
>   if (argc)
>     *p = 2;
>   result = *p;
>   return result;
> }
>
> Here, current LLVM GVN will do nothing, because it can't understand anything
> really about the stores.
> GCC's GVN will determine result is 2.
> NewGVN is not quite that smart yet (it require a little work to what we do
> to stores, and value numbering memory ssa versions)
>
> This issue compounds if you have conditional stores of the same value.
>
> So, for example, if you add:
>
> if (i < 30)
>   data[0] = 0;
>
> to the first case.
>
> GCC can still determine p is 0.
>
> Currently, NewGVN cannot.
>
> --Dan
>
> On Tue, Nov 15, 2016 at 10:49 PM, Davide Italiano <davide at freebsd.org>
> wrote:
>>
>> Hi,
>> we would like to propose  a new Global Value Numbering pass in LLVM.
>> The ideas/code are from Daniel Berlin (with a minor overhaul/splitting
>> into submittable patches from me). The code has been around for a
>> while (2012 or before), and we think it's getting ready to be
>> committed upstream.
>>
>> ### Motivation
>>
>> To put things into context: my personal motivation for having a new
>> GVN/PRE algorithm is LTO.
>> It's not a secret that LLVM is getting slower and slower release after
>> release, as Rafael discovered/pointed out in March [1] (and probably
>> many others found out). I personally took a shot at profiling LTO on
>> many internal/opensource applications (including clang itself) and
>> noticed that GVN always show in the top-3 passes (and it's generally
>> the pass where we spend most of the time in the middle-end). There are
>> cases (extreme) where 90% of the compile time goes in GVN.
>>
>> Example:
>>
>> ===-------------------------------------------------------------------------===
>>                       ... Pass execution timing report ...
>>
>> ===-------------------------------------------------------------------------===
>>   Total Execution Time: 684.7622 seconds (683.7141 wall clock)
>>
>>    ---User Time---   --System Time--   --User+System--   ---Wall
>> Time---  --- Name ---
>>   130.2558 ( 20.0%)   6.3128 ( 18.7%)  136.5686 ( 19.9%)  137.6145 (
>> 20.1%)  X86 DAG->DAG Instruction Selection
>>   55.4695 (  8.5%)   0.5501 (  1.6%)  56.0196 (  8.2%)  56.1049 (
>> 8.2%)  Function Integration/Inlining
>>   42.3492 (  6.5%)   0.0364 (  0.1%)  42.3856 (  6.2%)  42.8676 (
>> 6.3%)  Global Value Numbering
>>
>> ### Problems in the current GVN
>>
>> There are some issues in the current GVN infrastructure. A
>> non-exhaustive list (I'm pretty sure people have their list of
>> complains, these are the ones I care about the most).
>> * GVN is very slow for large test cases
>> * GVN doesn't do a real analysis, instead it eliminates as it goes, so
>> it's hard to reason about it.
>> * GVN doesn't perform any phi predication, i.e. doesn't know about phi
>> nodes, so later passes have to do some extra work to clean up
>> * There are bugs, e.g. [2] which would require a rewrite of PRE to be
>> fixed.
>>
>> ### NewGVN
>>
>> The new algorithm implements the ideas described in [3] with some
>> engineering optimizations by Dan (for example the set of touched
>> instructions is represented using a Bitvector instead of a set because
>> it's not uncommon for large functions where a predicate change that
>> thousands of instructions need to be changed, and we both measured 30%
>> of the whole pass time spent just modifying the set).
>> The code pretty much maps what the paper describe so I won't try to
>> repeat it here =)
>>
>> Some advantages of NewGVN:
>> * GVN performs a real analysis (which is now part of the pass itself
>> but potentially could be split and used as an utility by other
>> passes). For example, Quentin/Dan pointed out that outlining at the IR
>> level is hard without a proper value numbering analysis.
>> * On large testcases, It's faster than current GVN, even though didn't
>> go through a lot of profiling/optimization lately (I found it up to
>> 50% faster in compile time on some internal benchmarks). There are
>> some places were we can improve. For example, we spend a lot of time
>> inside Simplify* functions. Another considerable chunk of the time is
>> spent inside MemorySSA, but this is going to change once we preserve
>> MemSSA more aggressively.
>> * The code is easier to understand (at least to me)
>> [...]
>>
>> Some current limitations of NewGVN:
>> * NewGVN doesn't do everything that the current GVN does. There are
>> plans to implement missing features and more.
>> * On small testcases NewGVN may be slower as it does some work upfront
>> that the current GVN doesn't do. NewGVN is probably less lazy than it
>> should be, but it didn't matter for us so we didn't consider it a
>> priority. If somebody cares and finds a case where NewGVN
>> substantially regresses, speak up.
>>
>> The initial code can be found here https://reviews.llvm.org/D26224
>> The current patch includes only the "core" GVN algorithm, i.e. the
>> expression framework/the logic to perform congruence finding. Other
>> pieces (e.g. PRE will build on top of that).
>>
>> My rough plan is:
>> * Try to get this reviewed/tested and in tree.
>> * Fix miscompiles/bugs (and performance problems as/if they arise).
>> * Build other pieces of the algorithm on top of what we have. My
>> immediate concern will be implementing support for llvm.assume and
>> load coercion. Dan said he will try to find the time to work on the
>> predicate handling (actually, there's already a branch
>> https://github.com/dberlin/llvm-gvn-rewrite/tree/newgvn-predicates).
>>
>> Please let us know what you think. Any feedback/review comment/testing
>> is very appreciated!
>>
>> Thanks!
>>
>> [1] http://lists.llvm.org/pipermail/llvm-dev/2016-March/096488.html
>> [2] https://llvm.org/bugs/show_bug.cgi?id=30692#c11
>> [3] http://dl.acm.org/citation.cfm?id=512536
>>
>> --
>> Davide
>>
>> "There are no solved problems; there are only problems that are more
>> or less solved" -- Henri Poincare
>
>

-- 
Davide

"There are no solved problems; there are only problems that are more
or less solved" -- Henri Poincare