[PATCH] Add experimental PBQP support

Mon Sep 8 07:32:58 PDT 2014

From: David Blaikie [mailto:dblaikie at gmail.com] 
Sent: 08 September 2014 04:53
To: Arnaud De Grandmaison
Cc: Tim Northover; llvm-commits at cs.uiuc.edu; Lang Hames
Subject: Re: [PATCH] Add experimental PBQP support

On Sat, Sep 6, 2014 at 2:01 PM, Arnaud A. de Grandmaison <arnaud.degrandmaison at arm.com> wrote:

From: David Blaikie [mailto:dblaikie at gmail.com] 
Sent: 06 September 2014 21:40
To: Arnaud De Grandmaison
Cc: Tim Northover; llvm-commits at cs.uiuc.edu; Lang Hames
Subject: Re: [PATCH] Add experimental PBQP support

On Sat, Sep 6, 2014 at 8:30 AM, Arnaud A. de Grandmaison <arnaud.degrandmaison at arm.com> wrote:

Writing tests for a register allocator is not an easy task, as the set of all valid allocation is quite large, and can be equally good. What I have seen with the other allocators is that most testcases correspond to specific issues found in the allocator. My plan was to have an initial commit (this patch, with no real test), and then add testcases with subsequent commits as they improve specific areas of the allocation.

so this change itself doesn't add any improvements, just lays the foundation for improvements to come?

Correct. I see this patch as a foundation for improvements to come.

Great - perhaps you could commit the small test case you've added here ahead of time (to demonstrate that it passes without these changes), just adding more test coverage.

The test is merely a sanity check that the’ –aarch64-pbqp’ option exists and produce something sane. A testcase committed ahead of time would be a duplicate of r217057, a sanity check Lang added some time ago. 

 This patch only uses the existing infrastructure as is,  and was necessary to run a wide range of benchmarks and diagnose where improvements should be made.

Not sure I follow here - according to you & Lang the PBQP allocator already works on these architectures. How does this patch help you diagnose where improvements are to be made?

The PBQP works in the sense that the generate code is functionally correct, i.e there is no miscompilation --- no bogus register allocation. That’s a required preliminary before any performance improvement can take place. With this patch, we have everything in place to be able to compare 2 feature-wise similar versions of llvm for the AArch64/A57:

- LLVM with greedy + A57 FPLoadBalancing pass

- LLVM with PBQP + A57 target specific constraints

Both provide the same high level features, but with different implementations.

That said, if it's a natural precursor/lays the foundation for the ability to add more custom logic to the PBQP allocator for these architectures - great, let's go for it! (though I'd suggest you wait for Lang's OK here - I'm not nearly familiar enough with this stuff, unfortunately - just trying to understand the high level nature of the change you're proposing, because i'm curious)

It lays the foundations for the real work to take place.

I will definitely wait for Lang’s OK for the PBQP aspects of the patch, and Tim’s green light for the AArch64 aspects !

I already have a few on my list, the first one being improving how the different costs are set and relate together (allocation cost, interference cost & spill weight) --- and this will require some modification in the generic infrastructure and a backend with extra constraints to see the effects.

From: David Blaikie [mailto:dblaikie at gmail.com] 
Sent: 06 September 2014 17:17
To: Arnaud De Grandmaison
Cc: Tim Northover; llvm-commits at cs.uiuc.edu; Lang Hames
Subject: RE: [PATCH] Add experimental PBQP support

On Sep 6, 2014 8:08 AM, "Arnaud A. de Grandmaison" <arnaud.degrandmaison at arm.com> wrote:
>
> Hi Dave & Lang,
>
>  
>
> The AArch64 does not require extra constraints for the PBQP to work, but the AArch64/A57 benefits from setting additional constraints. On the A57, some sequence of operations will execute faster if some of their operands stays in even or odd registers. The Arch64FPLoadBalancing pass has been added to do some optimization there by permuting registers in the straight forward cases, whereas this can be solved generally and elegantly with the PBQP at register allocation time.

Awesome - thanks for the explanation.

Are the improvements separable into patches per specific improvement (with corresponding tests for each)?

>
>  
>
> Cheers,
>
> Arnaud
>
>  
>
> From: Lang Hames [mailto:lhames at gmail.com] 
> Sent: 06 September 2014 06:14
> To: David Blaikie
> Cc: Arnaud De Grandmaison; llvm-commits at cs.uiuc.edu; Tim Northover
> Subject: Re: [PATCH] Add experimental PBQP support
>
>  
>
> Hi Dave,
>
>  
>
> Out-of-the-box PBQP knows about the standard constraints that CodeGen models. Any Target that works with the standard allocators (E.g. greedy) should also work with PBQP. I believe Arnaud's patch is an optimisation. (Arnaud - please correct me if I'm wrong and AArch64 did require extra constraints, but I don't think it should?)
>
>  
>
> - Lang.
>
>  
>
> On Fri, Sep 5, 2014 at 3:45 PM, David Blaikie <dblaikie at gmail.com> wrote:
>
> This'll probably show how little I know about register allocation - but I thought Lang was telling me the other day that PBQP is essentially a drop/opt in for any architecture without having specific code for it (learning about the register set from the tablegen files and that was all it needed).
>
> Is that the case? Is the extra code in your patch then tuning, essentially - making PBQP better than the baseline table-driven PBQP for AArch64/A57? Or is my understanding incorrect?
>
> - David
>
>  
>
> On Fri, Sep 5, 2014 at 1:49 PM, Arnaud A. de Grandmaison <arnaud.degrandmaison at arm.com> wrote:
>>
>> I am currently investigating the benefits the PBQP register allocator could bring to the AArch64/A57.
>>
>>  
>>
>> This patch adds experimental support for PBQP. The PBQP is disabled by default, and can be enabled with the ‘–aarch64-pbqp’ command line option to llc when the cortex-a57 is in use.
>>
>>  
>>
>> I thought it would be a good thing to upstream this patch, as some other people in the community could be interested in experimenting with this allocator.
>>
>>  
>>
>> It passes all the tests (LNT, spec, …), but the performance of the generated code is not optimal yet. Expect some more patches in the coming days to improve the performance.
>>
>>  
>>
>> Cheers,
>>
>> --
>>
>> Arnaud A. de Grandmaison
>>
>>  
>>
>>  
>>
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>
>  
>
>  

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140908/9eb233d8/attachment.html>