[PATCH][Peephole] Rewrite copies to avoid cross register banks copies.
Quentin Colombet
qcolombet at apple.com
Mon Sep 9 09:20:31 PDT 2013
Ping^2?
-Quentin
On Sep 3, 2013, at 9:41 AM, Quentin Colombet <qcolombet at apple.com> wrote:
> Ping?
> -Quentin
>
> On Aug 28, 2013, at 5:29 PM, Quentin Colombet <qcolombet at apple.com> wrote:
>
>> Hi Evan,
>>
>> Eric suggested that you may be the right person to review this patch.
>> Could you have a look, please?
>>
>> This fixes <rdar://problem/14742333>.
>>
>> Thanks,
>> -Quentin
>> <PeepholeCopy.svndiff>
>>
>> On Aug 28, 2013, at 5:25 PM, Quentin Colombet <qcolombet at apple.com> wrote:
>>
>>>
>>> On Aug 28, 2013, at 4:43 PM, Eric Christopher <echristo at gmail.com> wrote:
>>>
>>>>>
>>>>> The patch is not triggered by any of the tests that are in the llvm test
>>>>> suite (tested on both ARM and X86).
>>>>>
>>>>
>>>> *nod* I figured, otherwise you'd have said it. :)
>>>>
>>>>> Like I said in my original email, I could rewrite/eliminate more
>>>>> aggressively copies (and in that case it does kick in in the llvm test
>>>>> suite) but then, the register allocator does in a few cases a poor
>>>>> coalescing/live-range splitting job.
>>>>>
>>>>
>>>> *nod*
>>>>
>>>>> That said, as it is, when the patch kicks in, we observed with our internal
>>>>> tests: no regressions and up to 25% runtime speed-up.
>>>>>
>>>>
>>>> Pretty nice. Can you give a general idea of the type of code that this
>>>> is triggering on?
>>> Basically it was some code using a tone of intrinsics, in particular mixing vector, integer, and floating point code.
>>> For isel, this means, tone of bitcast, vector insert, and vector extract, across different basic blocks.
>>>
>>>>
>>>>> Perhaps an
>>>>> example of what you're optimizing as a comment in the code?
>>>>>
>>>>> I have added one in the file header. What do you think?
>>>>>
>>>>
>>>> Helps. I was hoping for more code details, but the comment provides
>>>> some helpful bit.
>>> Do you think I should put more comments?
>>>
>>>>
>>>>> I have moved the assignment inside the loop.
>>>>> Is it better?
>>>>> (I do like predicated code!)
>>>>>
>>>>
>>>> Looks better to me from a readability perspective.
>>>>
>>>>>
>>>>> I'm not necessarily comfortable approving, but the code looks reasonable.
>>>>>
>>>>> What do you miss to be comfortable?
>>>>> Could you point me someone that may be comfortable to approve this, please?
>>>>> (So that I can through him under the bus!)
>>>>>
>>>>
>>>> Ultimately Jakob or Evan. It looks pretty simple so unless you get an
>>>> objection by tomorrow morning I guess submit?
>>> I will do that.
>>>
>>> Thanks,
>>> Quentin
>>>>
>>>> -eric
>>>>
>>>>> Thanks again.
>>>>>
>>>>> -Quentin
>>>>>
>>>>>
>>>>> -eric
>>>>>
>>>>>
>>>>> On Fri, Aug 23, 2013 at 1:04 PM, Quentin Colombet <qcolombet at apple.com>
>>>>> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> Here is a patch to rewrite copies to avoid cross register banks copies when
>>>>> possible.
>>>>> Thanks for your reviews.
>>>>>
>>>>> ** Context **
>>>>>
>>>>> By definition copies across register bank are not coalescable. Still, it may
>>>>> be possible to get rid of such a copy when the value is available in another
>>>>> register of the same register file.
>>>>> Consider the following example, where capital and lower letters denote
>>>>> different register file:
>>>>> b = copy A <-- cross-bank copy
>>>>> …
>>>>> C = copy b <-- cross-bank copy
>>>>>
>>>>> This could have been optimized this way:
>>>>> b = copy A <-- cross-bank copy
>>>>> …
>>>>> C = copy A <-- same-bank copy
>>>>>
>>>>> Note: b and C's definitions may be in different basic blocks.
>>>>>
>>>>>
>>>>> ** Proposed Solution **
>>>>>
>>>>> Add a peephole optimization that looks through a chain of copies leading to
>>>>> a cross-bank copy and reuses a source that is on the same register file if
>>>>> available.
>>>>>
>>>>> This solution could also be used to get rid of some copies (e.g., A could
>>>>> have been used instead of C). However, we do not do so because:
>>>>> - It may over constrain the coloring of the source register for coalescing.
>>>>> - The register allocator may not be able to find a nice split point for the
>>>>> longer live-range, leading to more spill.
>>>>>
>>>>>
>>>>> ** Testcase?! **
>>>>>
>>>>> This patch does not include a test case, because it is very difficult to
>>>>> reproduce this behavior with a reasonably small input.
>>>>>
>>>>> Cheers,
>>>>> -Quentin
>>>>>
>>>>> _______________________________________________
>>>>> llvm-commits mailing list
>>>>> llvm-commits at cs.uiuc.edu
>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20130909/f5332bd9/attachment.html>
More information about the llvm-commits
mailing list