[PATCH] D22317: Codegen: Tail Merge: Be less aggressive with special cases.
Kyle Butt via llvm-commits
llvm-commits at lists.llvm.org
Tue Aug 9 11:19:58 PDT 2016
iteratee added inline comments.
================
Comment at: lib/CodeGen/BranchFolding.cpp:621
@@ -621,1 +620,3 @@
+ // when SuccBB is the only successor.
+ if ((MBB1 == PredBB || MBB2 == PredBB) && MBB1->succ_size() == 1) {
MachineBasicBlock::iterator I;
----------------
davidxl wrote:
> iteratee wrote:
> > davidxl wrote:
> > > davidxl wrote:
> > > > This looks fine but should it be done only in layout mode?
> > > looks like if you merge this patch back to D18226 with a slight modification like this:
> > >
> > > if ((MBB1 == PredBB || MBB2 == PredBB) && (MBB1->succ_size() == 1 || !AfterPlacement)) {
> > >
> > > }
> > >
> > > D18226 will be good to go in .
> > No, I don't think it should be only layout mode.
> > rdf-copy is a good example of why. Without this patch:
> >
> >
> > foo: // @foo
> > // BB#0: // %entry
> > {
> > r1 = #0 ; jump .LBB0_2
> > }
> > .p2align 4
> > .LBB0_1: // %while.cond
> > // in Loop: Header=BB0_2 Depth=1
> > {
> > r1 = r0
> > r0 = memw(r0 + #0)
> > }
> > .LBB0_2: // %while.cond
> > // =>This Inner Loop Header: Depth=1
> > {
> > if (p0.new) r0 = r1
> > p0 = cmp.eq(r0, #0)
> > if (p0.new) jumpr:nt r31
> > }
> > {
> > jump .LBB0_1
> > }
> >
> > Note the loop is 3 packets long. Now with the patch:
> > foo: // @foo
> > // BB#0: // %entry
> > {
> > r1 = #0
> > p0 = cmp.eq(r0, #0)
> > }
> > {
> > if (p0) r0 = r1
> > if (p0) jumpr:nt r31
> > }
> > .p2align 4
> > .LBB0_1: // %while.cond
> > // =>This Inner Loop Header: Depth=1
> > {
> > r1 = r0
> > r0 = memw(r0 + #0)
> > if (!cmp.eq(r0.new, #0)) jump:t .LBB0_1
> > }
> > // BB#2: // %if.end
> > {
> > r0 = r1
> > jumpr r31
> > }
> > Now the loop is one packet long.
> > There are other cases where unanalyzable fallthrough prevents duplication, but we avoid the problem completely by not merging.
> But this fix is over-generalized
>
> The root cause of the problem in that test is that one of tail sequence is in a loop while the other is not. I think this can only happen for loop top-testing and bottom testing, so it seems to me a better general fix is to check if MBB1 or MBB2 dominates the other.
The forked diamond code is about to go in.
I still think that these changes are correct in general, but if you still disagree, I can drop this and roll it into D18226 only after placement.
I think the thresholds there need to be adjusted so that there is only one, but I can do that as well.
Repository:
rL LLVM
https://reviews.llvm.org/D22317
More information about the llvm-commits
mailing list