[llvm-dev] [GlobalISel][AArch64] Toward flipping the switch for O0: Please give it a try!

Mon May 29 01:06:34 PDT 2017

Thanks Quentin, it's in progress now, I'll let you know how it goes.

Cheers,
Diana

On 27 May 2017 at 03:36, Quentin Colombet <qcolombet at apple.com> wrote:
> Hi Kristof,
>
> I’ve pushed the localizer in r304051 and added it in the AArch64 O0 pipeline
> in r304052.
>
> I'll let Diana investigate the segfault she was seeing.
>
> @Diana, let me know if you need help.
>
> Cheers,
> -Quentin
>
> On May 25, 2017, at 1:53 PM, Quentin Colombet via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
>
> Hi Kristof,
>
> On May 25, 2017, at 2:09 AM, Kristof Beyls <kristof.beyls at arm.com> wrote:
>
>
> On 24 May 2017, at 22:01, Quentin Colombet <qcolombet at apple.com> wrote:
>
> Hi Kristof,
>
> Thanks for getting back so fast!
>
> On May 24, 2017, at 12:57 PM, Kristof Beyls <kristof.beyls at arm.com> wrote:
>
>
> On 24 May 2017, at 19:31, Quentin Colombet <qcolombet at apple.com> wrote:
>
> Hi Kristof,
>
> Thanks for the measurements.
>
> On May 24, 2017, at 6:00 AM, Kristof Beyls <kristof.beyls at arm.com> wrote:
>
>
> - Comparing against -O0 without globalisel but with the above regalloc
> options: 5.6% performance drop, 1% code size drop.
>
> In summary, the measurements indicate some good improvements.
> I also haven't measured the impact on compile time.
>
>
> Do you have a means to make this measurement?
> Ahmed did a bunch of compile time measurements on our side and I wanted to
> see if I need to put him on the hook again :).
>
>
> I did a quick setup with CTMark (part of the test-suite). I ran each of
> * '-O0 -g',
> * '-O0 -g -mllvm -global-isel=true -mllvm -global-isel-abort=0', and
> * '-O0 -g -mllvm -global-isel=true -mllvm -global-isel-abort=0 -mllvm
> -optimize-regalloc -mllvm -regalloc=greedy'
> 5 times, cross-compiling from X86 to AArch64, and took the median measured
> compile times.
> In summary, I see GlobalISel having a compile time that's 3.5% higher than
> the current -O0 default.
> With the greedy register allocator enabled, this increases to 28%.
> 28% is probably too high?
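
[Editor's sketch] The comparison described above (several runs per
configuration, median per benchmark, then an aggregate percentage against
the baseline) can be reproduced with a few lines of Python. This is only a
minimal sketch: the benchmark names and timings are made up, the function
names are hypothetical, and aggregating the per-benchmark ratios with a
geometric mean is an assumption about how the summary figure was computed.

```python
import math
from statistics import median


def median_times(runs):
    """Map each benchmark to the median of its repeated compile times."""
    return {bench: median(times) for bench, times in runs.items()}


def geomean_overhead_pct(baseline_runs, candidate_runs):
    """Geometric-mean compile-time change of candidate vs. baseline, in percent."""
    base = median_times(baseline_runs)
    cand = median_times(candidate_runs)
    # One ratio per benchmark: candidate median / baseline median.
    ratios = [cand[bench] / base[bench] for bench in base]
    geomean = math.exp(sum(math.log(r) for r in ratios) / len(ratios))
    return (geomean - 1.0) * 100.0


# Made-up compile times in seconds, 5 runs per benchmark (not real data).
baseline = {
    "benchA": [10.0, 10.2, 9.9, 10.1, 10.0],
    "benchB": [20.0, 19.8, 20.3, 20.1, 20.0],
}
candidate = {
    "benchA": [11.0, 11.1, 10.9, 11.0, 11.2],
    "benchB": [22.0, 21.9, 22.1, 22.0, 22.3],
}

overhead = geomean_overhead_pct(baseline, candidate)
print(f"compile time change vs. baseline: {overhead:+.1f}%")
```

Taking the median per benchmark keeps a single noisy run from skewing the
result, and the geometric mean keeps one long-compiling benchmark from
dominating the aggregate.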
>
>
> I think it is, yes.
> I have attached a quick hack to the greedy allocator to feature a fast mode.
> Could you give it a try?
>
> To enable the fast mode, please use (-mllvm) -regalloc-greedy-fast=true
> (default is false).
>
>
> I'm afraid it doesn't seem to save much compile time. On geomean, I see
> about 26% compile time increase against the current -O0 default (compared to
> 28% increase for regalloc greedy without your patch).
>
>
> Interesting, I guess a lot of time is spent in the coalescer. Could you give
> a try with -join-liveintervals=false?
>
>
> After adding -join-liveintervals=false, I see the compile time increase going
> up to 28% again.
>
>
> Heh, I am mildly surprised we hand much more live-ranges to the allocator
> when we do that.
>
>
>
> Do you know where the time is spent (-time-passes)?
>
>
> I'm afraid I won't have time to have a closer look in the next couple of
> days - I don't know where the time is spent at the moment.
>
>
> Fair enough, will investigate later.
>
>
>
> Anyhow, fixing all of those will take time, although I think it is the
> right approach, so we can go with the localizer.
>
>
> Right, I don't understand the register allocator well enough to know if that
> compile time overhead can be fixed, while still getting the necessary
> codegen benefits the greedy allocator gives.
> Is there any specific help you're looking for with getting the localizer
> to work well enough for production use?
>
>
> I’ll clean up the WIP patch for the localizer, then you guys can fix the bug
> that you found.
>
> I’ll do that tomorrow.
>
> Cheers,
> -Quentin
>
>
> Thanks,
>
> Kristof
>
>
>
>