[PATCH] PR18825 - New RegisterCoalescer subtarget hook
Chris Bieneman
cbieneman at apple.com
Mon Jul 14 17:00:38 PDT 2014
Thanks for the review Quentin. It shouldn't be hard to move the DenseMap into the MachineFunctionInfo, I'll rework the patch, and I will provide test cases for both of the cases you mentioned.
To close the loop on David's earlier question about the llvm performance suite. This patch has no impact on the generated code in any of our tests, so it will not alter performance.
Thanks,
-Chris
On Jul 14, 2014, at 3:56 PM, Quentin Colombet <qcolombet at apple.com> wrote:
> Hi Chris,
>
> The current patch LGTM, though the DenseMap in the Subtarget is not pretty.
> My main concern is with big module, where the map will keep a lot of useless information since we have no way to know when it is fine to clear it.
>
> I was thinking moving that map into the XXXMachineFunctionInfo, not sure how feasible is that.
>
> Regarding the final patch, assuming no further feedbacks come, I would expect to have a new test case for PR18825 and the update test for CodeGen/ARM/vldm-sched-a9.ll.
>
> Thanks,
> -Quentin
>
>> On Jul 14, 2014, at 1:23 PM, Chris Bieneman <cbieneman at apple.com> wrote:
>>
>> This morning I built SPEC with and without my patch and found no codegen differences.
>>
>> -Chris
>>
>> On Jul 11, 2014, at 5:19 PM, Chris Bieneman <cbieneman at apple.com> wrote:
>>
>>> David,
>>>
>>> I agree completely. I'll look into any codegen differences in our test suite, and in SPEC.
>>>
>>> -Chris
>>>
>>>
>>>
>>>> On Jul 11, 2014, at 4:26 PM, David Peixotto <dpeixott at codeaurora.org> wrote:
>>>>
>>>> Hi Chris,
>>>>
>>>> I agree that there is no simple solution. I am not opposed to your heuristic approach. I asked for stats or perf data because I know that the bug rarely happens, but I can't tell from reading the code how often the heuristic will hit. As you mentioned coalescing can have positive or negative performance impact and it would be good to understand if this change has any impact on codegen or performance for some larger benchmarks.
>>>>
>>>> That's my opinion anyway :)
>>>>
>>>> -David
>>>>
>>>> -- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
>>>>
>>>>> -----Original Message-----
>>>>> From: Chris Bieneman [mailto:cbieneman at apple.com]
>>>>> Sent: Friday, July 11, 2014 2:58 PM
>>>>> To: David Peixotto
>>>>> Cc: llvm-commits
>>>>> Subject: Re: [PATCH] PR18825 - New RegisterCoalescer subtarget hook
>>>>>
>>>>> Hi David,
>>>>>
>>>>> I haven’t run SPEC, but I have run all of LLVM’s in tree tests. Among
>>>>> those I haven’t found any codegen differences except the one test that I
>>>>> called out in my original email, and I suspect that would be a performance
>>>>> improvement.
>>>>>
>>>>> That said, there are certainly cases where blocking coalescing can
>>>>> generate worse code. I’ve been emailing back and forth with Quentin about
>>>>> potential longer-term solutions to this problem and there really is no
>>>>> simple solution.
>>>>>
>>>>> -Chris
>>>>>
>>>>>> On Jul 11, 2014, at 2:45 PM, David Peixotto <dpeixott at codeaurora.org>
>>>>> wrote:
>>>>>>
>>>>>> Hi Chris,
>>>>>>
>>>>>> It would be great to get this bug fixed. I have a general question about
>>>>> the heuristic.
>>>>>>
>>>>>> I believe the case we need to avoid in the bug is quite rare (running
>>>>> out of registers), but the heuristic will prevent colaescing in other
>>>>> cases as well. Do you have any statistics for how many coalescings are
>>>>> prevented by the heuristic and any performance data for spec or llvm test
>>>>> suite?
>>>>>>
>>>>>> Thanks,
>>>>>> David
>>>>>>
>>>>>> -- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
>>>>>> hosted by The Linux Foundation
>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: llvm-commits-bounces at cs.uiuc.edu [mailto:llvm-commits-
>>>>>>> bounces at cs.uiuc.edu] On Behalf Of Chris Bieneman
>>>>>>> Sent: Friday, July 11, 2014 1:45 PM
>>>>>>> To: llvm-commits
>>>>>>> Subject: [PATCH] PR18825 - New RegisterCoalescer subtarget hook
>>>>>>>
>>>>>>> The attached patch adds a new subtarget hook to the
>>>>>>> RegisterCoalescer, and an ARM implementation. The ARM implementation
>>>>>>> works around PR18825, and also better code in cases with a large number
>>>>> of NEON vectors in use.
>>>>>>>
>>>>>>> The basic problem is that the coalescer is very aggressive at
>>>>>>> propagating constraints on the register classes, and the register
>>>>>>> allocator doesn’t know how to split sub-registers later to recover.
>>>>>>> This patch provides an escape valve for targets that encounter this
>>>>> problem to limit coalescing.
>>>>>>>
>>>>>>> A few other notes about this patch:
>>>>>>>
>>>>>>> (1) In code that triggers the heuristic (using a lot of large vector
>>>>>>> registers) this code can improve compile time substantially, and in
>>>>>>> other code it has no adverse impact.
>>>>>>>
>>>>>>> (2) This code also breaks one of the tests in our test suite
>>>>>>> (CodeGen/ARM/vldm-sched-a9.ll). This test is designed to test NEON
>>>>>>> vector spilling, and with this patch the test no longer spills. With
>>>>>>> the final patch I will provide a new test case to test vector
>>>>>>> spilling, and this test will be marked XFAIL so that we don’t regress
>>>>>>> this and start spilling here again.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> -Chris
>>>>>>
>>>>
>>>>
>>
>>
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>
More information about the llvm-commits
mailing list