[LLVMdev] [RFC] AArch64: Should we disable GlobalMerge?
Kristof Beyls
kristof.beyls at arm.com
Thu Feb 26 02:33:55 PST 2015
Hi Ahmed,
Did you run these experiments on a platform whose linker makes use of
the information produced by the AArch64CollectLOH pass?
I'm guessing that this information, together with a linker that uses
it, could affect the profitability of the GlobalMerge pass.
Thanks,
Kristof
> -----Original Message-----
> From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu]
> On Behalf Of Ahmed Bougacha
> Sent: 26 February 2015 01:13
> To: LLVM Dev
> Subject: Re: [LLVMdev] [RFC] AArch64: Should we disable GlobalMerge?
>
> With the numbers!
> -Ahmed
>
>
> On Wed, Feb 25, 2015 at 4:57 PM, Ahmed Bougacha
> <ahmed.bougacha at gmail.com> wrote:
> > Hi all,
> >
> > I've started looking at the GlobalMerge pass, enabled by default on
> > ARM and AArch64. I think we should reconsider that, at least for
> > AArch64.
> >
> > As is, the pass just merges all globals together, in groups of 4KB
> > on AArch64 (128B on ARM).
> >
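To make the grouping described above concrete, here is a minimal sketch of size-limited merging. It is a simplified model, not the pass's actual code: the function name `merge_globals`, the `(name, size)` input shape, and the greedy in-order packing are all assumptions made for illustration, and alignment is ignored.

```python
from typing import List, Tuple


def merge_globals(
    globals_: List[Tuple[str, int]], max_offset: int
) -> List[List[Tuple[str, int]]]:
    """Greedily pack (name, size_in_bytes) pairs, in order, into groups.

    Hypothetical model: a group is closed as soon as adding the next
    global would push its total size past max_offset. Each group is
    returned as a list of (name, offset_within_group).
    """
    groups = []
    current = []
    offset = 0
    for name, size in globals_:
        # Start a new group if this global does not fit in the current one.
        if current and offset + size > max_offset:
            groups.append(current)
            current, offset = [], 0
        current.append((name, offset))
        offset += size
    if current:
        groups.append(current)
    return groups
```

Calling `merge_globals([("a", 4), ("b", 8)], 4096)` puts both globals in one group at offsets 0 and 4. Note that nothing in this model looks at which code uses which global, which is the limitation discussed in the quoted message.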
> > At the time it was enabled, the general thinking was "it's almost
> > free, it doesn't affect performance much, we might as well use it".
> > Now, it's preventing some link-time optimizations (as acknowledged in
> > one of the FIXMEs).
> >
> >
> > -- Performance impact
> > Overall, it isn't that profitable on the test-suite, and actually
> > degrades performance on a lot of other - "non-benchmark" - projects I
> > tried (where the main reason to use a global is file- or function-
> > static variables, only accessed through a single getter function).
> >
> > Across several runs on the entire test-suite, when disabling the pass,
> > I measured:
> >   without LTO, a -0.19% geomean improvement
> >   with LTO, a +0.11% geomean regression.
> >
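For readers unfamiliar with the term, a geomean change over a set of benchmarks can be computed roughly as below. This is a sketch under assumptions: the message does not say how its figures were derived, the function name `geomean_change` is hypothetical, and the sign convention (negative means less execution time) is chosen here for illustration.

```python
import math


def geomean_change(baseline_times, new_times):
    """Return the geometric-mean change in percent.

    Assumed convention: each ratio is new_time / baseline_time, so a
    negative result means the new configuration ran faster on average.
    """
    ratios = [new / base for base, new in zip(baseline_times, new_times)]
    # Average the logarithms, then exponentiate: this is the geometric mean.
    log_mean = sum(math.log(r) for r in ratios) / len(ratios)
    return (math.exp(log_mean) - 1.0) * 100.0
```

The geometric mean is the usual choice for ratios because a 2x slowdown on one benchmark and a 2x speedup on another cancel out exactly, which an arithmetic mean of ratios would not give.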
> > As for just SPEC2006, there are two big regressions: 400.perlbench
> > (10.6% w/ LTO, 2.7% w/o) and 471.omnetpp (2.3% w/, 3.9% w/o).
> >
> > Numbers are attached.
> >
> >
> > -- A way forward
> > One obvious way to improve it is: look at uses of globals, and try to
> > form sets of globals commonly used together. The tricky part is to
> > define heuristics for "commonly". Also, the pass then becomes much
> > more expensive. I'm currently looking into improving it, and will
> > report if I come up with a good solution. But this shouldn't stop us
> > from disabling it, for now.
> >
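One possible reading of "sets of globals commonly used together" is sketched below. It is not the heuristic proposed in the message, which leaves "commonly" undefined: here it is assumed to mean "referenced by the same function at least `min_count` times", and the names `group_by_co_use`, `uses`, and `min_count` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def group_by_co_use(uses, min_count=1):
    """Group globals that are referenced together.

    uses maps a function name to the set of globals it references.
    Two globals end up in the same group when at least min_count
    functions use both of them (assumed definition of "commonly").
    """
    # Count, for every pair of globals, how many functions use both.
    pair_counts = Counter()
    for globals_used in uses.values():
        for pair in combinations(sorted(globals_used), 2):
            pair_counts[pair] += 1

    # Union-find over all globals seen in any function.
    parent = {g: g for gs in uses.values() for g in gs}

    def find(g):
        while parent[g] != g:
            parent[g] = parent[parent[g]]  # path halving
            g = parent[g]
        return g

    for (a, b), count in pair_counts.items():
        if count >= min_count:
            parent[find(a)] = find(b)

    groups = {}
    for g in parent:
        groups.setdefault(find(g), set()).add(g)
    return sorted(sorted(members) for members in groups.values())
```

Even this simple version enumerates every pair of globals per function, which hints at why a use-aware pass would be noticeably more expensive than merging everything blindly.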
> > Also, the pass seems like a good candidate for
> > -O3/CodeGenOpt::Aggressive. However, the latter is implied by LTO,
> > which IMO shouldn't include these not-always-profitable optimizations.
> > That's another problem though.
> >
> >
> >
> > Right now, I think we should disable the pass by default, until it's
> > deemed profitable enough.
> >
> > -Ahmed