<div dir="ltr">For example, for your first example, the IR is something like this:<div><div><br></div><div>; Function Attrs: nounwind ssp uwtable</div><div>define i32 @test(i32* readnone, i32, i32, i32, i32) local_unnamed_addr #0 {</div><div>  %6 = icmp eq i32* %0, null</div><div>  br i1 %6, label %7, label %9, !prof !2</div><div><br></div><div>; <label>:7:                                      ; preds = %5</div><div>  %8 = tail call i32 (i32, ...) bitcast (i32 (...)* @calli to i32 (i32, ...)*)(i32 %1) #2</div><div>  br label %9</div><div><br></div><div>; <label>:9:                                      ; preds = %7, %5</div><div>  %10 = phi i32  [ 0, %5 ], [ %8, %7 ]</div><div>  %11 = or i32 %1, 1</div><div>  %12 = sdiv i32 %10, %11</div><div>  %13 = or i32 %2, 1</div><div>  %14 = sdiv i32 %12, %13</div><div>  %15 = or i32 %3, 1</div><div>  %16 = sdiv i32 %14, %15</div><div>  %17 = or i32 %4, 1</div><div>  %18 = sdiv i32 %16, %17</div><div>  ret i32 %18</div><div>}</div></div><div><br></div><div>NewGVN actually understands that, for example, %12 == phi(sdiv %8, %11, 0)[1] it just doesn't keep it because it doesn't try to prove that it's free if the rest go away. I could make it do that (which would not require real code duplication, just code movement).</div><div><br></div><div><br></div><div>[1] </div><div><div>Processing instruction   %12 = sdiv i32 %10, %11</div><div>Simplified   <badref> = sdiv i32 0, %11 to  constant i32 0</div><div>Found phi of ops operand i32 0 in %5</div></div><div>Cannot find phi of ops operand for <badref> = sdiv i32 %8, %11 in block %7</div><div><br></div><div><br></div><div><br></div><div><br></div><div><br></div><div><br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Aug 31, 2017 at 7:08 PM, Daniel Berlin <span dir="ltr"><<a href="mailto:dberlin@dberlin.org" target="_blank">dberlin@dberlin.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">NewGVN does some of it.<div>It will discover if backpropagation through a phi enables an operation to be put in terms of existing expressions or constants.</div><div>It does not track ranges though, only explicit values.</div><div><br></div><div>In your ORIG block, if LENGTH can be expressed as a merge of  already-computed expressions or constants in the program, it will discover it.</div><div><br></div><div>You can stare at examples of this (and compute quite complicated ones if you wish) at test/Transforms/NewGVN/<wbr>completeness.ll</div><div><br></div><div>Both your examples already have a phi in it in llvm ir, so that is just a matter of the proper range propagation.</div><div>It is really:<br><div><br class="m_7546756379174642208gmail-Apple-interchange-newline">int test()</div><div>{<br> int Ret0 = 0;</div><div>  if (!Ptr)</div><div>     Ret1= calli(a)</div><div>  RetPhi = PHI(Ret0, Ret1)</div><div>  return RetPhi;</div></div><div>}</div><div><br></div><div>It should already be possible to determine the value of Ret0 to be 0 based on ranges without duplicating anything, and the extra return doesn't give you anything. </div><div><br></div><div>If you want to do it based on ranges:  a combination of a value range computation using the existing sparse propagation engine, and a transform at phi nodes like we use for NewGVN, should be able to handle what you want here.    The existing lazy value info may not get an optimal answer for various reasons (it is trying to propagate in the wrong direction, so it loses some context)</div><div><br></div><div>You could also use the same basic infrastructure that predicateinfo uses, and rename info at points the ranges could have changed to get additional precision.</div><div> (unlike predicateinfo, which renames when there are branches). </div><div><br></div><div>There are quite a number of papers on making SSA forms that are effective at range propagation. PredicateInfo is building pruned e-ssa, which is the basis of a number of range analysis (see <a href="http://homepages.dcc.ufmg.br/~fernando/publications/papers/CGO13_raphael.pdf" target="_blank">http://homepages.dcc.<wbr>ufmg.br/~fernando/<wbr>publications/papers/CGO13_<wbr>raphael.pdf</a> and the slides at <a href="http://laure.gonnord.org/pro/research/ER03_2015/RangeAnalysis.pdf" target="_blank">http://laure.gonnord.org/<wbr>pro/research/ER03_2015/<wbr>RangeAnalysis.pdf</a>).</div><div><br></div><div>The only case you should ever need to duplicate code is when the expression is something not already computed in the program.  That's not the case for your first example, it can be done without code duplication.  For your second example, it would require insertion, but you could know ahead of time which computations you actually need to insert.</div><div><br></div><div><br></div></div><div class="HOEnZb"><div class="h5"><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Aug 31, 2017 at 6:22 PM, Hal Finkel via llvm-dev <span dir="ltr"><<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

  
  <div bgcolor="#FFFFFF" text="#000000"><div><div class="m_7546756379174642208h5">

    <p><br>

    </p>

    <div class="m_7546756379174642208m_2536526375415948568moz-cite-prefix">On 08/31/2017 03:54 PM, Tony Jiang via

      llvm-dev wrote:<br>

    </div>

    <blockquote type="cite">

      
      <font size="2" face="Arial">Hi All,</font><br>

      <br>

      <font size="2" face="Arial">We have recently found some

        optimization

        opportunities created by replicating code into branches in order

        to enable

        optimization. In general, the optimization opportunity we are

        pursuing

        is like the following.</font><font size="2" face="Arial"><br>

      </font><font size="2" face="Arial"><br>

        Given pseudo-code:</font><font size="2" face="Arial"><br>

      </font><font size="1" face="Courier New"><br>

        // block A<br>

        if (some condition)<br>

          // block B<br>

        // block C</font><font size="2" face="Arial"><br>

      </font><font size="2" face="Arial"><br>

        If it can be efficiently proven that some portion of block C can

        be simplified

        had control flow not gone through the if statement, it might be

        useful

        to convert this CFG into a diamond and hoist that portion of

        block C into

        both block B and the new block.<br>

         </font><font size="2" face="Arial"><br>

      </font><font size="2" face="Arial"><br>

        Consider the following example:<br>

           </font><font size="2" face="Arial"> </font><br>

      <font size="2" face="Arial"><br>

      </font><font size="1" face="Courier New"><br>

        int test(int *Ptr, int a, int b, int c, int d) { </font><br>

      <font size="1" face="Courier New">  int Ret = 0;<br>

          if (__builtin_expect(!Ptr, 0)) {<br>

            Ret = calli(a);<br>

            // return Ret / (a|1) / (b|1) / (c|1) / (d|1); // Copy

        return

        to here<br>

          }<br>

          return Ret / (a|1) / (b|1) / (c|1) / (d|1); // This can be

        simplified

        to return 0<br>

        }</font><font size="2" face="Arial"> </font><br>

      <font size="2" face="Arial"><br>

        In this case, replicating the return statement in the branch

        allows the

        optimizer to prove the value of Ret at the end of the function

        is 0 and

        eliminate the arithmetic calculations.<br>

         <br>

        A second example:</font><font size="2" face="Arial"> </font><br>

      <br>

      <font size="1" face="Courier New">unsigned long

        funcReturningArbitraryi64(unsi<wbr>gned

        char *p);</font><br>

      <font size="1" face="Courier New">#define LENGTH(uv)  ( (uv) <

        0x80             ? 1 :  \</font><br>

      <font size="1" face="Courier New">         

                    (uv) < 0x800    

               ? 2 :  \</font><br>

      <font size="1" face="Courier New">         

                    (uv) < 0x10000    

             ? 3 :  \</font><br>

      <font size="1" face="Courier New">         

                    (uv) < 0x200000    

            ? 4 :  \</font><br>

      <font size="1" face="Courier New">         

                    (uv) < 0x4000000    

           ? 5 :  \</font><br>

      <font size="1" face="Courier New">         

                    (uv) < 0x80000000    

          ? 6 : 7 )</font><br>

      <br>

      <font size="1" face="Courier New">int func(unsigned char *p, bool

        flag)</font><br>

      <font size="1" face="Courier New">{</font><br>

      <font size="1" face="Courier New">  unsigned long c = *p;</font><br>

      <font size="1" face="Courier New">  int len;</font><br>

      <font size="1" face="Courier New">  // ...</font><br>

      <font size="1" face="Courier New">#ifdef _ORIG</font><br>

      <font size="1" face="Courier New">  if(flag) {</font><br>

      <font size="1" face="Courier New">    // ...</font><br>

      <font size="1" face="Courier New">    c =

        funcReturningArbitraryi64(p);</font><br>

      <font size="1" face="Courier New">  }</font><br>

      <font size="1" face="Courier New">len = LENGTH(c);</font><br>

      <font size="1" face="Courier New">#else</font><br>

      <font size="1" face="Courier New">  if(flag) {</font><br>

      <font size="1" face="Courier New">    // ...</font><br>

      <font size="1" face="Courier New">    c =

        funcReturningArbitraryi64(p);</font><br>

      <font size="1" face="Courier New">    len = LENGTH(c);</font><br>

      <font size="1" face="Courier New">  } else {</font><br>

      <font size="1" face="Courier New">    len = LENGTH(c);</font><br>

      <font size="1" face="Courier New">  }</font><br>

      <font size="1" face="Courier New">#endif</font><br>

      <br>

      <font size="1" face="Courier New">  // ...</font><br>

      <br>

      <font size="1" face="Courier New">  return len;</font><br>

      <font size="1" face="Courier New">}</font><br>

      <font size="2" face="Arial"><br>

        In this case, we see that creating an else block and replicating

        the return

        statement in both the if and else branch (like the code snippet

        guarded

        by the #else) enables the macro UNISKIP in the else branch to be

        optimized.</font><font size="2" face="Arial"><br>

      </font><font size="2" face="Arial"><br>

         <br>

        Most of the examples we have come up with so far are centered

        around value

        ranges along the conditional branches. When the range of values

        a symbol

        can have along different branches is provably different,

        opportunities

        for optimization may arise. However, this likely isn't the only

        category

        of optimizations that could benefit from this.<br>

         <br>

        Is there an existing transformation in LLVM that should be doing

        this already

        that is missing this opportunity? If not, we would like to

        pursue adding

        this. Of course, this optimization would be subject to a cost

        model as

        it may result in code growth. For example, it may not be

        advantageous to

        do this if the simplified branch is cold. If anyone has any

        comments/suggestions

        we are very much interested in hearing them.</font><br>

    </blockquote>

    <br></div></div>

    We have two transformations that track ranges along CFG edges using

    LazyValueInfo: JumpThreading and CorrelatedValuePropagation. I don't

    think that either does what you're proposing.<br>

    <br>

     -Hal<br>

    <br>

    <blockquote type="cite"><span><br>

      <br>

      <br>

      <font size="2" face="Arial">Regards, <br>

        <br>

        <br>

        Tony Jiang, M.Sc.<br>

        LLVM PPC Backend Development<br>

        IBM Toronto Lab, C2/712/8200/MKM<br>

        8200 Warden Ave, Markham, L6G 1C7<br>

        Email: </font><a href="mailto:nemanjai@ca.ibm.com" target="_blank"><font color="blue" size="2" face="Arial"><u>jtony@ca.ibm.com</u></font></a><font size="2" face="Arial"><br>

        Phone: <a href="tel:(905)%20413-3676" value="+19054133676" target="_blank">905-413-3676</a></font><br>

      <br>

      <fieldset class="m_7546756379174642208m_2536526375415948568mimeAttachmentHeader"></fieldset>

      <br>

      </span><pre>______________________________<wbr>_________________

LLVM Developers mailing list

<a class="m_7546756379174642208m_2536526375415948568moz-txt-link-abbreviated" href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>

<a class="m_7546756379174642208m_2536526375415948568moz-txt-link-freetext" href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" target="_blank">http://lists.llvm.org/cgi-bin/<wbr>mailman/listinfo/llvm-dev</a><span class="m_7546756379174642208HOEnZb"><font color="#888888">

</font></span></pre><span class="m_7546756379174642208HOEnZb"><font color="#888888">

    </font></span></blockquote><span class="m_7546756379174642208HOEnZb"><font color="#888888">

    <br>

    <pre class="m_7546756379174642208m_2536526375415948568moz-signature" cols="72">-- 

Hal Finkel

Lead, Compiler Technology and Programming Languages

Leadership Computing Facility

Argonne National Laboratory</pre>

  </font></span></div>


<br>______________________________<wbr>_________________<br>

LLVM Developers mailing list<br>

<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a><br>

<a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" rel="noreferrer" target="_blank">http://lists.llvm.org/cgi-bin/<wbr>mailman/listinfo/llvm-dev</a><br>

<br></blockquote></div><br></div>

</div></div></blockquote></div><br></div>