[PATCH][Peephole] Rewrite copies to avoid cross register banks copies.

Quentin Colombet qcolombet at apple.com
Fri Oct 4 12:59:21 PDT 2013


Good point.

Let me gather some numbers.

-Quentin

On Oct 4, 2013, at 12:50 PM, Evan Cheng <evan.cheng at apple.com> wrote:

> The patch looks fine to me. Is there any measurable compile time impact?
> 
> Evan
> 
> On Sep 9, 2013, at 9:20 AM, Quentin Colombet <qcolombet at apple.com> wrote:
> 
>> Ping^2?
>> 
>> -Quentin
>> 
>> On Sep 3, 2013, at 9:41 AM, Quentin Colombet <qcolombet at apple.com> wrote:
>> 
>>> Ping?
>>> -Quentin
>>> 
>>> On Aug 28, 2013, at 5:29 PM, Quentin Colombet <qcolombet at apple.com> wrote:
>>> 
>>>> Hi Evan,
>>>> 
>>>> Eric suggested that you may be the right person to review this patch.
>>>> Could you have a look, please?
>>>> 
>>>> This fixes <rdar://problem/14742333>.
>>>> 
>>>> Thanks,
>>>> -Quentin
>>>> <PeepholeCopy.svndiff>
>>>> 
>>>> On Aug 28, 2013, at 5:25 PM, Quentin Colombet <qcolombet at apple.com> wrote:
>>>> 
>>>>> 
>>>>> On Aug 28, 2013, at 4:43 PM, Eric Christopher <echristo at gmail.com> wrote:
>>>>> 
>>>>>>> 
>>>>>>> The patch is not triggered by any of the tests that are in the llvm test
>>>>>>> suite (tested on both ARM and X86).
>>>>>>> 
>>>>>> 
>>>>>> *nod* I figured, otherwise you'd have said it. :)
>>>>>> 
>>>>>>> Like I said in my original email, I could rewrite/eliminate more
>>>>>>> aggressively copies (and in that case it does kick in in the llvm test
>>>>>>> suite) but then, the register allocator does in a few cases a poor
>>>>>>> coalescing/live-range splitting job.
>>>>>>> 
>>>>>> 
>>>>>> *nod*
>>>>>> 
>>>>>>> That said, as it is, when the patch kicks in, we observed with our internal
>>>>>>> tests: no regressions and up to 25% runtime speed-up.
>>>>>>> 
>>>>>> 
>>>>>> Pretty nice. Can you give a general idea of the type of code that this
>>>>>> is triggering on?
>>>>> Basically it was some code using a tone of intrinsics, in particular mixing vector, integer, and floating point code.
>>>>> For isel, this means, tone of bitcast, vector insert, and vector extract, across different basic blocks.
>>>>> 
>>>>>> 
>>>>>>> Perhaps an
>>>>>>> example of what you're optimizing as a comment in the code?
>>>>>>> 
>>>>>>> I have added one in the file header. What do you think?
>>>>>>> 
>>>>>> 
>>>>>> Helps. I was hoping for more code details, but the comment provides
>>>>>> some helpful bit.
>>>>> Do you think I should put more comments?
>>>>> 
>>>>>> 
>>>>>>> I have moved the assignment inside the loop.
>>>>>>> Is it better?
>>>>>>> (I do like predicated code!)
>>>>>>> 
>>>>>> 
>>>>>> Looks better to me from a readability perspective.
>>>>>> 
>>>>>>> 
>>>>>>> I'm not necessarily comfortable approving, but the code looks reasonable.
>>>>>>> 
>>>>>>> What do you miss to be comfortable?
>>>>>>> Could you point me someone that may be comfortable to approve this, please?
>>>>>>> (So that I can through him under the bus!)
>>>>>>> 
>>>>>> 
>>>>>> Ultimately Jakob or Evan. It looks pretty simple so unless you get an
>>>>>> objection by tomorrow morning I guess submit?
>>>>> I will do that.
>>>>> 
>>>>> Thanks,
>>>>> Quentin
>>>>>> 
>>>>>> -eric
>>>>>> 
>>>>>>> Thanks again.
>>>>>>> 
>>>>>>> -Quentin
>>>>>>> 
>>>>>>> 
>>>>>>> -eric
>>>>>>> 
>>>>>>> 
>>>>>>> On Fri, Aug 23, 2013 at 1:04 PM, Quentin Colombet <qcolombet at apple.com>
>>>>>>> wrote:
>>>>>>> 
>>>>>>> Hi,
>>>>>>> 
>>>>>>> Here is a patch to rewrite copies to avoid cross register banks copies when
>>>>>>> possible.
>>>>>>> Thanks for your reviews.
>>>>>>> 
>>>>>>> ** Context **
>>>>>>> 
>>>>>>> By definition copies across register bank are not coalescable. Still, it may
>>>>>>> be possible to get rid of such a copy when the value is available in another
>>>>>>> register of the same register file.
>>>>>>> Consider the following example, where capital and lower letters denote
>>>>>>> different register file:
>>>>>>> b = copy A <-- cross-bank copy
>>>>>>>>>>>>>> C = copy b <-- cross-bank copy
>>>>>>> 
>>>>>>> This could have been optimized this way:
>>>>>>> b = copy A  <-- cross-bank copy
>>>>>>>>>>>>>> C = copy A <-- same-bank copy
>>>>>>> 
>>>>>>> Note: b and C's definitions may be in different basic blocks.
>>>>>>> 
>>>>>>> 
>>>>>>> ** Proposed Solution **
>>>>>>> 
>>>>>>> Add a peephole optimization that looks through a chain of copies leading to
>>>>>>> a cross-bank copy and reuses a source that is on the same register file if
>>>>>>> available.
>>>>>>> 
>>>>>>> This solution could also be used to get rid of some copies (e.g., A could
>>>>>>> have been used instead of C). However, we do not do so because:
>>>>>>> - It may over constrain the coloring of the source register for coalescing.
>>>>>>> - The register allocator may not be able to find a nice split point for the
>>>>>>> longer live-range, leading to more spill.
>>>>>>> 
>>>>>>> 
>>>>>>> ** Testcase?! **
>>>>>>> 
>>>>>>> This patch does not include a test case, because it is very difficult to
>>>>>>> reproduce this behavior with a reasonably small input.
>>>>>>> 
>>>>>>> Cheers,
>>>>>>> -Quentin
>>>>>>> 
>>>>>>> _______________________________________________
>>>>>>> llvm-commits mailing list
>>>>>>> llvm-commits at cs.uiuc.edu
>>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>>> 
>>>> _______________________________________________
>>>> llvm-commits mailing list
>>>> llvm-commits at cs.uiuc.edu
>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>> 
>>> _______________________________________________
>>> llvm-commits mailing list
>>> llvm-commits at cs.uiuc.edu
>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>> 
> 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20131004/b74b1412/attachment.html>


More information about the llvm-commits mailing list