[llvm-dev] [GlobalISel][AArch64] Toward flipping the switch for O0: Please give it a try!

Quentin Colombet via llvm-dev llvm-dev at lists.llvm.org
Thu May 18 10:09:48 PDT 2017


Hi Diana,

> On May 18, 2017, at 1:15 AM, Diana Picus <diana.picus at linaro.org> wrote:
> 
> On 18 May 2017 at 09:06, Kristof Beyls <Kristof.Beyls at arm.com> wrote:
>> I think Diana found that when enabling r299283, the bootstrap failed with
>> llvm-tblgen segfaulting.
>> So there clearly is some work required there.
> 
> Indeed.
> 
> @Quentin, what is the status of that patch?

Initially it was meant as a prototype to investigate how we could address those issues at O0. I didn’t mean to publish it.

> Have you been working on
> it since then?

No, I haven’t touched it and I honestly didn’t plan to do that.

> Should I investigate the segfault more and send you a
> reproducer?

Depends, do you guys want to pursue in that direction?
My though was that it is probably best to rely on a better reg allocator (basic or greedy) scheme for O0. I believe the only concern may be compile time but it may  just fly (+ I have ideas how to make it better pretty easy). 

> Is this patch the way forward, or do you have other ideas
> for reducing the stack usage?

Better reg alloc scheme for O0 :).

> 
> Thanks,
> Diana
> 
>> 
>> Thanks,
>> 
>> Kristof
>> 
>> 
>> On 16 May 2017, at 14:06, Diana Picus via llvm-dev <llvm-dev at lists.llvm.org>
>> wrote:
>> 
>> Turns out it really is a GlobalISel issue - we eat up too much stack
>> space because all the constants used in the switches are stored on the
>> stack. We need to fix this somehow before changing the default. I'll
>> try to give it a run with Quentin's r299283 on top to see if it helps.
>> 
>> Cheers,
>> Diana
>> 
>> On 15 May 2017 at 09:38, Diana Picus <diana.picus at linaro.org> wrote:
>> 
>> Got another one: https://bugs.llvm.org/show_bug.cgi?id=33036
>> 
>> I'm still investigating whether this is an actual GlobalISel issue or
>> something else (I'll also start a build on a more recent revision to
>> see how that behaves).
>> 
>> On 12 May 2017 at 20:06, Eric Christopher <echristo at gmail.com> wrote:
>> 
>> Agreed. That was probably just luck before :)
>> 
>> -eric
>> 
>> On Fri, May 12, 2017 at 5:22 AM Diana Picus via llvm-dev
>> <llvm-dev at lists.llvm.org> wrote:
>> 
>> 
>> It turns out that can be fixed by adding -lm to the link line, so I
>> will probably convert it into a test-suite bug.
>> 
>> I don't suppose it's crucial to handle the fabs intrinsic nicely at -O0.
>> 
>> On 12 May 2017 at 13:44, Diana Picus <diana.picus at linaro.org> wrote:
>> 
>> Hi,
>> 
>> I ran into a little snag on the test-suite:
>> https://bugs.llvm.org/show_bug.cgi?id=33021
>> It boils down to GlobalISel generating calls to fabs instead of using
>> FABSDr (so we get undefined references).
>> 
>> Cheers,
>> Diana
>> 
>> On 11 May 2017 at 18:40, Quentin Colombet <qcolombet at apple.com> wrote:
>> 
>> Hi Diana,
>> 
>> Thanks for the summary.
>> 
>> On May 11, 2017, at 4:01 AM, Diana Picus <diana.picus at linaro.org>
>> wrote:
>> 
>> Hi all,
>> 
>> I'm still running some validation on this, I'll send an email when
>> it's done. If that goes well I don't have anything against making the
>> switch.
>> 
>> For the record, here's a summary of issues that were deferred for
>> later on (some of which are optimization-ish and we might decide to
>> never do at -O0 at all):
>> * Crash in RegBankSelect for half fp types:
>> https://bugs.llvm.org/show_bug.cgi?id=32560
>> 
>> 
>> I’ll have a look.
>> 
>> * Improving constant placement:
>> http://bugs.llvm.org/show_bug.cgi?id=32561
>> 
>> 
>> I’ve commented in the PR to mention the localizer technic I was playing
>> with, if someone wants to give it a try.
>> 
>> * Fancy switch lowering
>> * Transforming division-by-constant-power-of-2 into right shift
>> 
>> 
>> AFAICT all the other issues that were brought up were fixed (yay!).
>> 
>> Cheers,
>> Diana
>> 
>> 
>> Cheers,
>> -Quentin
>> 
>> 
>> 
>> On 11 May 2017 at 08:44, Kristof Beyls via llvm-dev
>> <llvm-dev at lists.llvm.org> wrote:
>> 
>> 
>> On 10 May 2017, at 17:36, Quentin Colombet <qcolombet at apple.com> wrote:
>> 
>> 
>> MultiSource/Benchmarks/mediabench/mpeg2/mpeg2dec/mpeg2decode (46%):
>> Function
>> Reference_IDCT: Probably due to creating all constants in the entry BB
>> +
>> spilling floating point data through an X register:
>> 
>> FastISel:
>> fadd d0, d1, d0
>> str d0, [sp,#528]
>> GlobalISel:
>> fadd d0, d1, d0
>> fmov x9, d0
>> stur x9, [x29,#-48]
>> 
>> 
>> Good finding, I forgot to do stores in my previous fix. I’ll do them
>> shortly.
>> 
>> 
>> Should be fixed by r302679
>> 
>> 
>> Thanks Quentin,
>> 
>> That reduces the slow-down when enabling globalisel at -O0 from 13% (on
>> r302453)  to 9.5% (on r302679) in my experiments.
>> The code size increase also reduces from just over 3% to 2.8%.
>> 
>> Kristof
>> 
>> 
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>> 
>> 
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>> 
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>> 
>> 



More information about the llvm-dev mailing list