<div dir="ltr"><br><div>This also looks like some problem in later pass that get exposed by tail merging. For instance the two branches in the bottom test should be swapped. Ideally, with tail merging, the code should look like the following:</div><div><br></div><div><div>foo:                                    // @foo</div><div>// BB#0:                                // %entry</div><div>        {</div><div>                        r1 = #0 ; jump .LBB0_2</div><div>        }</div><div>        .p2align        4</div><div>.LBB0_1:                                // %while.cond</div><div>                                        //   in Loop: Header=BB0_2 Depth=1</div><div>        {</div><div>                        r1 = r0</div><div>                        r0 = memw(r0 + #0)</div><div>        }</div><div>.LBB0_2:                                // %while.cond</div><div>                                        // =>This Inner Loop Header: Depth=1</div><div>        {</div><div>                        if (!cmp.eq(r0.new, #0)) jump .LBB0_1</div><div>        }</div><div>        {</div><div>                        jumpr:nt r31</div><div>        }</div><div>.LBB0_3:                                // %if.end</div></div><div><br></div><div>I am fine having special handling to disable cases like this, but not too general.</div><div><br></div><div>David</div><div><br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Jul 13, 2016 at 4:32 PM, David Li <span dir="ltr"><<a href="mailto:davidxl@google.com" target="_blank">davidxl@google.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">davidxl added inline comments.<br>
<span class=""><br>
================<br>
Comment at: lib/CodeGen/BranchFolding.cpp:621<br>
@@ -621,1 +620,3 @@<br>
</span><span class="">+  // when SuccBB is the only successor.<br>
+  if ((MBB1 == PredBB || MBB2 == PredBB) && MBB1->succ_size() == 1) {<br>
     MachineBasicBlock::iterator I;<br>
</span>----------------<br>
<div><div class="h5">iteratee wrote:<br>
> davidxl wrote:<br>
> > davidxl wrote:<br>
> > > This looks fine but should it be done only in layout mode?<br>
> > looks like if you merge this patch back to D18226 with a slight modification like this:<br>
> ><br>
> >   if ((MBB1 == PredBB || MBB2 == PredBB) && (MBB1->succ_size() == 1 || !AfterPlacement)) {<br>
> ><br>
> > }<br>
> ><br>
> > D18226 will be good to go in .<br>
> No, I don't think it should be only layout mode.<br>
> rdf-copy is a good example of why. Without this patch:<br>
><br>
><br>
>     foo:                                    // @foo<br>
>     // BB#0:                                // %entry<br>
>             {<br>
>                             r1 = #0 ; jump .LBB0_2<br>
>         }<br>
>              .p2align        4<br>
>      .LBB0_1:                                // %while.cond<br>
>                                              //   in Loop: Header=BB0_2 Depth=1<br>
>              {<br>
>                              r1 = r0<br>
>                              r0 = memw(r0 + #0)<br>
>              }<br>
>      .LBB0_2:                                // %while.cond<br>
>                                              // =>This Inner Loop Header: Depth=1<br>
>              {<br>
>                              if (p0.new) r0 = r1<br>
>                              p0 = cmp.eq(r0, #0)<br>
>                              if (p0.new) jumpr:nt r31<br>
>              }<br>
>              {<br>
>                              jump .LBB0_1<br>
>              }<br>
><br>
> Note the loop is 3 packets long. Now with the patch:<br>
>      foo:                                    // @foo<br>
>      // BB#0:                                // %entry<br>
>              {<br>
>                              r1 = #0<br>
>                              p0 = cmp.eq(r0, #0)<br>
>              }<br>
>              {<br>
>                              if (p0) r0 = r1<br>
>                              if (p0) jumpr:nt r31<br>
>              }<br>
>              .p2align        4<br>
>      .LBB0_1:                                // %while.cond<br>
>                                              // =>This Inner Loop Header: Depth=1<br>
>              {<br>
>                              r1 = r0<br>
>                              r0 = memw(r0 + #0)<br>
>                              if (!cmp.eq(r0.new, #0)) jump:t .LBB0_1<br>
>              }<br>
>      // BB#2:                                // %if.end<br>
>              {<br>
>                              r0 = r1<br>
>                              jumpr r31<br>
>              }<br>
> Now the loop is one packet long.<br>
> There are other cases where unanalyzable fallthrough prevents duplication, but we avoid the problem completely by not merging.<br>
</div></div>But this fix is over-generalized<br>
<br>
The root cause of the problem in that test is that one of tail sequence is in a loop while the other is not. I think this can only happen for loop top-testing and bottom testing, so it seems to me a better general fix is to check if MBB1 or MBB2 dominates the other.<br>
<div class="HOEnZb"><div class="h5"><br>
<br>
Repository:<br>
  rL LLVM<br>
<br>
<a href="http://reviews.llvm.org/D22317" rel="noreferrer" target="_blank">http://reviews.llvm.org/D22317</a><br>
<br>
<br>
<br>
</div></div></blockquote></div><br></div>