Disable arm-gepopt by default
James Molloy
james at jamesmolloy.co.uk
Tue Apr 21 06:55:03 PDT 2015
See:
http://reviews.llvm.org/D9148
http://reviews.llvm.org/D9149
Cheers,
James
On Tue, 21 Apr 2015 at 12:06 James Molloy <james at jamesmolloy.co.uk> wrote:
> Hi Michael, Gerolf,
>
> The main goal of this optimization was to reduce the number of
> multiply-adds emitted when complex GEPs are used. With regards to SPEC,
> this is particularly prevalent in sjeng and gobmk (although we don't just
> care about SPEC and there were plenty of other places we caught this idiom).
>
> The pass is actively reducing the number of multiply-adds in gobmk:
>
> $ llc engine.ll -O3 -mllvm -aarch64-gep-opt=false -o - |& grep madd | wc -l
> 297
>
> $ llc engine.ll -O3 -mllvm -aarch64-gep-opt=true -o - |& grep madd | wc -l
> 158
>
> So it demonstrably is making at least some code better.
>
> I was able to reproduce your twofish example. Twofish, like most crypto
> algorithms, has a pattern repeated many times so the problem could just be
> down to one pattern producing one more instruction, which it turns out is
> exactly the case.
>
> The problem is that with gep-opt enabled, we're associating shifts and
> constants the other way around.
>
> With gep-opt disabled (synthetic code example):
> add x0, x0, #72
> ldr x0, [x1, x0, lsl #2]
>
> With gep-opt enabled:
> add x0, x1, x0, lsl #2
> ldr x0, [x0, #72]
>
> This is actually fine, until another expression reuses a different shift
> with the same offset:
>
> With gep-opt disabled:
> add x0, x0, #72
> ldr x0, [x1, x0, lsl #2]
> ldr x2, [x1, x0, lsl #3]
>
> With gep-opt enabled:
> add x0, x1, x0, lsl #2
> add x2, x1, x0, lsl #3
> ldr x0, [x0, #72]
> ldr x2, [x0, #72]
>
> Amusingly, I saw the opposite behaviour on AArch32 on a different testcase
> and came to the conclusion that expressions with correlated immediate
> offsets and identical shift amounts were more likely than expressions with
> correlated shift amounts and identical immediate offsets, but that's a
> different story.
>
> Anyway, the reason this is happening is because we're not running
> reassociate after GEP expansion. If we run reassociate it all goes away,
> it's a one-line fix.
>
> The two testcases that were disabled - well, they weren't really disabled,
> but the behaviour was changed for them. I put that down to a bug in the
> SeparateConstOffsetFromGEP pass, where it tries to separate a GEP with only
> one variadic offset:
>
> GEP i8* %base, i32 %offset
>
> Nothing beneficial can come of that, so I've got, again, a single line
> patch to fix this. I'll push both patches upstream just as soon as I verify
> them.
>
> So in summary, I don't see any need to disable this pass. It's doing good
> work, it just has one or two bugs.
>
> Cheers,
>
> James
>
> On Tue, 21 Apr 2015 at 05:27 Michael Zolotukhin <mzolotukhin at apple.com>
> wrote:
>
>> Hi Tim and others,
>>
>> So, the entire discussion is moving around the assumption that this pass
>> is beneficial in some cases, but we don’t have such data. Please note that
>> I don’t want to say that the assumption is wrong, but as far as I
>> understand, only Hao observed some gains. Thus, until we hear from him, we
>> have real slow downs versus presumable gains.
>>
>> As for the looking into the code later - I’m not an expert in this area,
>> and I don’t even have a motivational example for this optimization, so I’d
>> barely be a right person for it. I’d be glad to assist with collecting some
>> additional data/running tests if that might help though.
>>
>> Thanks,
>> Michael
>>
>>
>> > On Apr 20, 2015, at 8:40 PM, Tim Northover <t.p.northover at gmail.com>
>> wrote:
>> >
>> > On 20 April 2015 at 20:34, Gerolf Hoflehner <ghoflehner at apple.com>
>> wrote:
>> >> It shouldn’t come across as ‘extreme’.
>> >
>> > The "extreme" part wasn't so much the tentative disabling as "Could
>> > you and/or Hao take this on after the code is disabled/backed out?". I
>> > realise it wasn't an ultimatum, but it could very easily turn into
>> > "we're washing our hands of this", which I'd object to.
>> >
>> > I assume the pass was committed with some kind of metric that it
>> > improved, and now we have a counter-example. It's down to everyone to
>> > decide what the best way to proceed is.
>> >
>> > Cheers.
>> >
>> > Tim.
>> >
>> > _______________________________________________
>> > llvm-commits mailing list
>> > llvm-commits at cs.uiuc.edu
>> > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>
>>
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20150421/2b9c7a08/attachment.html>
More information about the llvm-commits
mailing list