[PATCH] PR18825 - New RegisterCoalescer subtarget hook

Eric Christopher echristo at gmail.com
Wed Jul 16 13:18:01 PDT 2014


Haven't looked at the, but feel free to post-review them since the
idea is already ok and it should be a pretty direct translation.

-eric

On Wed, Jul 16, 2014 at 1:15 PM, Chris Bieneman <beanz at apple.com> wrote:
> Here are updated patches to move the hook from the subtarget to the
> TargetRegisterInfo.
>
> There are no functional differences from what is already committed.
>
> -Chris
>
>
> On Jul 16, 2014, at 10:29 AM, Eric Christopher <echristo at gmail.com> wrote:
>
> On Wed, Jul 16, 2014 at 10:28 AM, Jim Grosbach <grosbach at apple.com> wrote:
>
>
> On Jul 15, 2014, at 7:42 PM, Eric Christopher <echristo at gmail.com> wrote:
>
> On Tue, Jul 15, 2014 at 6:55 PM, Chris Bieneman <cbieneman at apple.com> wrote:
>
>
> On Jul 15, 2014, at 4:16 PM, Eric Christopher <echristo at gmail.com> wrote:
>
> Couple of small comments:
>
> +  unsigned SizeMultiplier = MBB->size()/100;
>
> Mmm.. random numbers. Why 100? Why not 101? How about a comment :)
>
>
> I can add a comment. The basic idea is that the heuristic restricts how many
> registers which meet certain criteria (large, sub-register indexed, …) get
> coalesced in a given basic block. It seems logical that the larger the basic
> block the more coalescing you should allow. Most of the tests in the
> llvm-test suite don’t hit this heuristic at all because they just don’t
> utilize many enough large registers. Unfortunately that means this really is
> a bit of a magical number.
>
>
> Yeah, it's nice to have it be some sort of documented constant rather
> than ... well, an undocumented constant. How'd you come up with the
> number btw?
>
>
>
> Otherwise... on the subtarget. Why there? Why not on the
> TargetRegisterInfo? Seems to make just as much sense for this use and
> TargetRegisterInfo already has a subtarget if you wanted to make cpu
> specific queries there.
>
> Thoughts?
>
>
> I hadn’t really considered putting it on the TargetRegisterInfo. When
> Quentin and I discussed the patch off-list we discussed it as a subtarget
> hook, so that was how I implemented it. In retrospect I still like it in the
> subtarget because the heuristic feels to me like it is a property of the
> subtarget, but it could be implemented in the TargetRegisterInfo.
>
>
> Conveniently TargetRegisterInfo is owned by each subtarget so it's
> like a subtarget hook :)
>
>
> Either seems fairly reasonable to me. I slightly prefer TargetRegisterInfo
> since the coalescer is already looking to that. So as long as that’s
> straightforward to implement, I’d go that direction.
>
>
> Agreed. Thanks.
>
> -eric
>
> -Jim
>
>
> I just figured since this seemed to be a register oriented type of
> hook that it might make sense to go there. It's one of those things
> that we haven't really done in the past, but that seemed to be because
> of the confusion with what a Subtarget was versus a Target.
>
> Quentin, do you have any thoughts on this?
>
>
> Jim generally has good insight here too. Jim?
>
> -eric
>
> -Chris
>
>
> -eric
>
>
>
> On Tue, Jul 15, 2014 at 7:31 AM, Quentin Colombet <qcolombet at apple.com>
> wrote:
>
> Hi Chris,
>
> Thanks for the update, I like what you did with the MachineFunctionInfo!
>
> LGTM with two comments:
> - AFAICT, we do not put anymore the date in filename for test cases.
> - For the out-of-registers.ll test case, please add some CHECK lines, just
> to be sure we are generating something correct.
>
> Thanks,
> -Quentin
>
> On Jul 14, 2014, at 10:26 PM, Chris Bieneman <cbieneman at apple.com> wrote:
>
> Quentin,
>
> Here are updated diffs with the changes you suggested. I’ve also included
> new test cases, one to test vector spilling, and one to test PR18825.
>
> If there are no other concerns about this patch I will commit it tomorrow.
>
> Thank you Quentin and David for the feedback.
>
> -Chris
> <18825.diff>
>
> On Jul 14, 2014, at 5:00 PM, Chris Bieneman <cbieneman at apple.com> wrote:
>
> Thanks for the review Quentin. It shouldn't be hard to move the DenseMap
> into the MachineFunctionInfo, I'll rework the patch, and I will provide test
> cases for both of the cases you mentioned.
>
> To close the loop on David's earlier question about the llvm performance
> suite. This patch has no impact on the generated code in any of our tests,
> so it will not alter performance.
>
> Thanks,
> -Chris
>
> On Jul 14, 2014, at 3:56 PM, Quentin Colombet <qcolombet at apple.com> wrote:
>
> Hi Chris,
>
> The current patch LGTM, though the DenseMap in the Subtarget is not pretty.
> My main concern is with big module, where the map will keep a lot of useless
> information since we have no way to know when it is fine to clear it.
>
> I was thinking moving that map into the XXXMachineFunctionInfo, not sure how
> feasible is that.
>
> Regarding the final patch, assuming no further feedbacks come, I would
> expect to have a new test case for PR18825 and the update test for
> CodeGen/ARM/vldm-sched-a9.ll.
>
> Thanks,
> -Quentin
>
> On Jul 14, 2014, at 1:23 PM, Chris Bieneman <cbieneman at apple.com> wrote:
>
> This morning I built SPEC with and without my patch and found no codegen
> differences.
>
> -Chris
>
> On Jul 11, 2014, at 5:19 PM, Chris Bieneman <cbieneman at apple.com> wrote:
>
> David,
>
> I agree completely. I'll look into any codegen differences in our test
> suite, and in SPEC.
>
> -Chris
>
>
>
> On Jul 11, 2014, at 4:26 PM, David Peixotto <dpeixott at codeaurora.org> wrote:
>
> Hi Chris,
>
> I agree that there is no simple solution. I am not opposed to your heuristic
> approach. I asked for stats or perf data because I know that the bug rarely
> happens, but I can't tell from reading the code how often the heuristic will
> hit. As you mentioned coalescing can have positive or negative performance
> impact and it would be good to understand if this change has any impact on
> codegen or performance for some larger benchmarks.
>
> That's my opinion anyway :)
>
> -David
>
> -- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted
> by The Linux Foundation
>
> -----Original Message-----
> From: Chris Bieneman [mailto:cbieneman at apple.com]
> Sent: Friday, July 11, 2014 2:58 PM
> To: David Peixotto
> Cc: llvm-commits
> Subject: Re: [PATCH] PR18825 - New RegisterCoalescer subtarget hook
>
> Hi David,
>
> I haven’t run SPEC, but I have run all of LLVM’s in tree tests. Among
> those I haven’t found any codegen differences except the one test that I
> called out in my original email, and I suspect that would be a performance
> improvement.
>
> That said, there are certainly cases where blocking coalescing can
> generate worse code. I’ve been emailing back and forth with Quentin about
> potential longer-term solutions to this problem and there really is no
> simple solution.
>
> -Chris
>
> On Jul 11, 2014, at 2:45 PM, David Peixotto <dpeixott at codeaurora.org>
>
> wrote:
>
>
> Hi Chris,
>
> It would be great to get this bug fixed. I have a general question about
>
> the heuristic.
>
>
> I believe the case we need to avoid in the bug is quite rare (running
>
> out of registers), but the heuristic will prevent colaescing in other
> cases as well. Do you have any statistics for how many coalescings are
> prevented by the heuristic and any performance data for spec or llvm test
> suite?
>
>
> Thanks,
> David
>
> -- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
> hosted by The Linux Foundation
>
> -----Original Message-----
> From: llvm-commits-bounces at cs.uiuc.edu [mailto:llvm-commits-
> bounces at cs.uiuc.edu] On Behalf Of Chris Bieneman
> Sent: Friday, July 11, 2014 1:45 PM
> To: llvm-commits
> Subject: [PATCH] PR18825 - New RegisterCoalescer subtarget hook
>
> The attached patch adds a new subtarget hook to the
> RegisterCoalescer, and an ARM implementation. The ARM implementation
> works around PR18825, and also better code in cases with a large number
>
> of NEON vectors in use.
>
>
> The basic problem is that the coalescer is very aggressive at
> propagating constraints on the register classes, and the register
> allocator doesn’t know how to split sub-registers later to recover.
> This patch provides an escape valve for targets that encounter this
>
> problem to limit coalescing.
>
>
> A few other notes about this patch:
>
> (1) In code that triggers the heuristic (using a lot of large vector
> registers) this code can improve compile time substantially, and in
> other code it has no adverse impact.
>
> (2) This code also breaks one of the tests in our test suite
> (CodeGen/ARM/vldm-sched-a9.ll). This test is designed to test NEON
> vector spilling, and with this patch the test no longer spills. With
> the final patch I will provide a new test case to test vector
> spilling, and this test will be marked XFAIL so that we don’t regress
> this and start spilling here again.
>
> Thanks,
> -Chris
>
>
>
>
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>
>
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>
>
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>
>
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>
>
>




More information about the llvm-commits mailing list