[LLVMdev] [RFC] AArch64: Should we disable GlobalMerge?

Fri Feb 27 14:03:51 PST 2015

On Thu, Feb 26, 2015 at 4:09 AM, Renato Golin <renato.golin at linaro.org> wrote:
> On 26 February 2015 at 00:57, Ahmed Bougacha <ahmed.bougacha at gmail.com> wrote:
>> -- A way forward
>> One obvious way to improve it is: look at uses of globals, and try to
>> form sets of globals commonly used together.  The tricky part is to
>> define heuristics for "commonly".  Also, the pass then becomes much
>> more expensive.  I'm currently looking into improving it, and will
>> report if I come up with a good solution.  But this shouldn't stop us
>> from disabling it, for now.
>
> Hi Ahmed,
>
> Before "moving forward", it would be good to understand what in
> GlobalMerge is impacting what in LTO.
>
> With LTO becoming more important nowadays, I agree we have to balance
> the compiler optimisations to work well with it, but by turning things
> off we might be impacting unknown code in an unknown way.
>
> We'll never know how unknown code behaves, but if at least we
> understand what of GM affects what of LTO, then people using unknown
> code will have a more informed view on what to disable, when.

Fair enough.  First, a couple things to note:
- GlobalMerge runs as a pre-ISel pass, so very late in the mid-level pipeline.
- GlobalMerge (by default) only looks at internal globals.

Internal globals come from file- or function-static variables.  In
LTO, all module-level globals are internalized, and so become eligible
for merging.

So, we can generally group global usage into a few categories:
- a function that uses a local static variable (say, llvm::outs())
- a function that uses several globals at once.  For instance,
400.perlbench's interpreter has a bunch of those, as does its
parser/lexer.
- a set of functions that share a few common globals (say, an inlined
reference to a function-local static variable), but otherwise each use
several other globals (again, perl's interpreter).

GlobalMerge is only ever a win if we are able to share base pointers.
This requires:
- several globals being referenced
- the references being close enough (otherwise we'll just
rematerialize the base, or worse, increase register pressure)

There is one obvious special case for the first requirement:  if a
global is only ever used alone, there's no point in merging it
anywhere (this is improvement #1).
Once we can determine the set of used globals for each function, we
can try to merge those sets only (#2).
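
To make #1 and #2 concrete, here is a rough sketch in Python.  It is a
toy model, not the pass's code: the input format (a map from function
name to the globals it references) and all the names in it are made up
for illustration.

```python
def used_sets(uses):
    """Map each function to the set of globals it references.

    uses: dict mapping function name -> iterable of global names
    (a hypothetical input format, chosen for this sketch).
    """
    return {fn: frozenset(gs) for fn, gs in uses.items()}


def never_used_with_others(uses):
    """#1: globals that only ever appear alone, so merging cannot help."""
    sets = used_sets(uses)
    all_globals = set().union(*sets.values())
    used_together = set()
    for gs in sets.values():
        if len(gs) > 1:
            used_together |= gs
    return all_globals - used_together


def candidate_sets(uses):
    """#2: one candidate merge set per distinct multi-global used set."""
    return {gs for gs in used_sets(uses).values() if len(gs) > 1}


# Hypothetical functions and globals.
uses = {
    "get_stream": ["stream"],      # function-local static, used alone
    "run_op": ["sp", "op"],
    "next_token": ["buf", "op"],
}
print(sorted(never_used_with_others(uses)))  # ['stream']
print(len(candidate_sets(uses)))             # 2
```

Note that the two candidate sets overlap on "op", and a global can
only live in one merged object, so some choice between them is still
needed.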

We can try to better handle the second requirement, by having some
more precise metric for distance between uses.  One trivially
available such metric is grouping used sets by parent basic-block
rather than function (#3).
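
A small sketch of what #3 changes, again as a toy Python model rather
than anything from the pass.  The (function, block) keys and the
global names are hypothetical.

```python
def merge_candidates(uses, per_block=False):
    """Collect used sets of globals, keyed by function or by basic block.

    uses: dict mapping (function, block) -> iterable of global names.
    Returns the distinct used sets that contain more than one global.
    """
    grouped = {}
    for (fn, block), globals_used in uses.items():
        key = (fn, block) if per_block else fn
        grouped.setdefault(key, set()).update(globals_used)
    return {frozenset(s) for s in grouped.values() if len(s) > 1}


# One function: "a" and "b" are used together in one block, "c" alone
# in another.
uses = {
    ("f", "entry"): ["a", "b"],
    ("f", "exit"): ["c"],
}
per_function = merge_candidates(uses)                  # {a, b, c}
per_basic_block = merge_candidates(uses, per_block=True)  # {a, b} only
```

Grouping by block drops "c" from the candidate set, since its use is
not next to the others.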

Experimentally, #1 catches a lot of the singleton-ish globals out
there, which is the majority in some of the more "modern" code I've
looked at.  It leaves the legitimate merging in perl alone.

#2 (and even more so #3) is actually too aggressive, and misses many,
if not most, of the profitable cases in perl.  Consider:
- a "g_log" global (or, say, LLVM's outs/dbgs/errs), used pretty much everywhere
- several sets of globals, used in different parts of the program
(perl's interpreter vs parser)

You'd pick one of the latter sets, and add the "g_log" global to it.
Now you've made it more expensive everywhere you use "g_log", without the
benefit of base sharing in all the other functions.
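
The "g_log" problem can be put into numbers with a toy score.  This is
only a sketch: the scoring rule below is an assumption made up for
illustration (one base saved per extra member a function touches, a
fixed penalty when a function touches exactly one member), and every
name except "g_log" is hypothetical.

```python
def net_benefit(merged, uses, solo_penalty=1):
    """Toy score for merging the globals in `merged` into one object.

    Assumed rule, not taken from the pass: a function touching k >= 2
    members saves k - 1 base materializations; a function touching
    exactly one member pays `solo_penalty`, since it goes through the
    merged object without sharing the base with anything.
    """
    score = 0
    for globals_used in uses.values():
        k = len(merged & set(globals_used))
        if k >= 2:
            score += k - 1
        elif k == 1:
            score -= solo_penalty
    return score


uses = {
    "interp_a": ["sp", "op", "g_log"],
    "interp_b": ["sp", "op"],
    "parse_a": ["buf", "tok", "g_log"],
    "parse_b": ["buf", "tok", "g_log"],
    "util_a": ["g_log"],
    "util_b": ["g_log"],
    "util_c": ["g_log"],
}
print(net_benefit({"sp", "op"}, uses))           # 2
print(net_benefit({"sp", "op", "g_log"}, uses))  # -2
```

Under these assumptions, adding "g_log" to the interpreter's set gains
one base in interp_a and loses one in each of the five other functions
that use "g_log".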

So you need to be smart when picking the sets.  You can combine some
of them, using some cost metric.  (#4)  This is where it gets
complicated.

I'll try measuring some of those, and see what happens on benchmarks.
Again, that shouldn't stop us from enabling GlobalMerge less often.
Hopefully it's clear that the pass isn't always a win, so enabling it
only at -O3 should be OK.  I'm less comfortable with disabling it on
Darwin only, but that seems like the obvious next step.

Thanks for the feedback!

-Ahmed

> cheers,
> --renato