[PATCH] D22317: Codegen: Tail Merge: Be less aggressive with special cases.

Wed Jul 13 18:49:43 PDT 2016

This also looks like some problem in later pass that get exposed by tail
merging. For instance the two branches in the bottom test should be
swapped. Ideally, with tail merging, the code should look like the
following:

foo:                                    // @foo
// BB#0:                                // %entry
        {
                        r1 = #0 ; jump .LBB0_2
        }
        .p2align        4
.LBB0_1:                                // %while.cond
                                        //   in Loop: Header=BB0_2 Depth=1
        {
                        r1 = r0
                        r0 = memw(r0 + #0)
        }
.LBB0_2:                                // %while.cond
                                        // =>This Inner Loop Header: Depth=1
        {
                        if (!cmp.eq(r0.new, #0)) jump .LBB0_1
        }
        {
                        jumpr:nt r31
        }
.LBB0_3:                                // %if.end

I am fine having special handling to disable cases like this, but not too
general.

David

On Wed, Jul 13, 2016 at 4:32 PM, David Li <davidxl at google.com> wrote:

> davidxl added inline comments.
>
> ================
> Comment at: lib/CodeGen/BranchFolding.cpp:621
> @@ -621,1 +620,3 @@
> +  // when SuccBB is the only successor.
> +  if ((MBB1 == PredBB || MBB2 == PredBB) && MBB1->succ_size() == 1) {
>      MachineBasicBlock::iterator I;
> ----------------
> iteratee wrote:
> > davidxl wrote:
> > > davidxl wrote:
> > > > This looks fine but should it be done only in layout mode?
> > > looks like if you merge this patch back to D18226 with a slight
> modification like this:
> > >
> > >   if ((MBB1 == PredBB || MBB2 == PredBB) && (MBB1->succ_size() == 1 ||
> !AfterPlacement)) {
> > >
> > > }
> > >
> > > D18226 will be good to go in .
> > No, I don't think it should be only layout mode.
> > rdf-copy is a good example of why. Without this patch:
> >
> >
> >     foo:                                    // @foo
> >     // BB#0:                                // %entry
> >             {
> >                             r1 = #0 ; jump .LBB0_2
> >         }
> >              .p2align        4
> >      .LBB0_1:                                // %while.cond
> >                                              //   in Loop: Header=BB0_2
> Depth=1
> >              {
> >                              r1 = r0
> >                              r0 = memw(r0 + #0)
> >              }
> >      .LBB0_2:                                // %while.cond
> >                                              // =>This Inner Loop
> Header: Depth=1
> >              {
> >                              if (p0.new) r0 = r1
> >                              p0 = cmp.eq(r0, #0)
> >                              if (p0.new) jumpr:nt r31
> >              }
> >              {
> >                              jump .LBB0_1
> >              }
> >
> > Note the loop is 3 packets long. Now with the patch:
> >      foo:                                    // @foo
> >      // BB#0:                                // %entry
> >              {
> >                              r1 = #0
> >                              p0 = cmp.eq(r0, #0)
> >              }
> >              {
> >                              if (p0) r0 = r1
> >                              if (p0) jumpr:nt r31
> >              }
> >              .p2align        4
> >      .LBB0_1:                                // %while.cond
> >                                              // =>This Inner Loop
> Header: Depth=1
> >              {
> >                              r1 = r0
> >                              r0 = memw(r0 + #0)
> >                              if (!cmp.eq(r0.new, #0)) jump:t .LBB0_1
> >              }
> >      // BB#2:                                // %if.end
> >              {
> >                              r0 = r1
> >                              jumpr r31
> >              }
> > Now the loop is one packet long.
> > There are other cases where unanalyzable fallthrough prevents
> duplication, but we avoid the problem completely by not merging.
> But this fix is over-generalized
>
> The root cause of the problem in that test is that one of tail sequence is
> in a loop while the other is not. I think this can only happen for loop
> top-testing and bottom testing, so it seems to me a better general fix is
> to check if MBB1 or MBB2 dominates the other.
>
>
> Repository:
>   rL LLVM
>
> http://reviews.llvm.org/D22317
>
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20160713/1421601b/attachment.html>