<div dir="ltr">Note that this kind of "graph CSE" is the basis of HVN/HU/HRU in <div><a href="https://www.cs.ucsb.edu/~benh/research/papers/hardekopf07exploiting.pdf">https://www.cs.ucsb.edu/~benh/research/papers/hardekopf07exploiting.pdf</a><br></div><div><br></div><div>You could do the same analysis on this graph.</div><div><br></div><div><br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Aug 25, 2016 at 10:17 AM, David Callahan <span dir="ltr"><<a href="mailto:dcallahan@fb.com" target="_blank">dcallahan@fb.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div style="word-wrap:break-word;color:rgb(0,0,0);font-size:14px;font-family:Calibri,sans-serif">
<div>Here is a summary of the experiment behind this patch</div>
<div><a href="https://www.facebook.com/notes/david-callahan/llvm-cfl-alias-analysis/10150648030284971" target="_blank">https://www.facebook.com/<wbr>notes/david-callahan/llvm-cfl-<wbr>alias-analysis/<wbr>10150648030284971</a></div>
<div><br>
</div>
<span>
<div style="font-family:Calibri;font-size:11pt;text-align:left;color:black;BORDER-BOTTOM:medium none;BORDER-LEFT:medium none;PADDING-BOTTOM:0in;PADDING-LEFT:0in;PADDING-RIGHT:0in;BORDER-TOP:#b5c4df 1pt solid;BORDER-RIGHT:medium none;PADDING-TOP:3pt">
<span style="font-weight:bold">From: </span>Daniel Berlin <<a href="mailto:dberlin@dberlin.org" target="_blank">dberlin@dberlin.org</a>><br>
<span style="font-weight:bold">Date: </span>Thursday, August 25, 2016 at 9:55 AM<div><div class="h5"><br>
<span style="font-weight:bold">To: </span>David Callahan <<a href="mailto:dcallahan@fb.com" target="_blank">dcallahan@fb.com</a>><br>
<span style="font-weight:bold">Cc: </span>George Burgess IV <<a href="mailto:george.burgess.iv@gmail.com" target="_blank">george.burgess.iv@gmail.com</a>>, LLVM Dev Mailing list <<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>><br>
<span style="font-weight:bold">Subject: </span>Re: [llvm-dev] CFLAA<br>
</div></div></div><div><div class="h5">
<div><br>
</div>
<div>
<div>
<div dir="ltr">(and sys::cas_flag that STATISTIC uses is a uint32 ...)</div>
<div class="gmail_extra"><br>
<div class="gmail_quote">On Thu, Aug 25, 2016 at 9:54 AM, Daniel Berlin <span dir="ltr">
<<a href="mailto:dberlin@dberlin.org" target="_blank">dberlin@dberlin.org</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div dir="ltr">Okay, dumb question:<br>
Are you really getting negative numbers in the second column?<span>
<div><br>
</div>
<div>
<p class="MsoNormal" style="margin:0in 0in 0.0001pt;font-size:14px;font-family:Calibri,sans-serif;color:rgb(0,0,0)">
<span style="font-size:9pt;font-family:Consolas"> 526,766 -136 mem2reg # PHI nodes inserted<u></u><u></u></span></p>
</div>
<div><span style="font-size:9pt;font-family:Consolas"><br>
</span></div>
</span>
<div><font face="Consolas"><span style="font-size:12px"><a href="http://llvm.org/docs/doxygen/html/PromoteMemoryToRegister_8cpp_source.html" target="_blank">http://llvm.org/docs/doxygen/h<wbr>tml/PromoteMemoryToRegister_8c<wbr>pp_source.html</a></span></font><br>
</div>
<div><font face="Consolas"><span style="font-size:12px">(Search for NumPHIInsert).</span></font></div>
<div><font face="Consolas"><span style="font-size:12px"><br>
I don't see how it could be negative unless this wrapped around?<br>
<br>
</span></font></div>
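<div>A minimal sketch of the arithmetic behind the wraparound question, assuming only what the thread states (a 32-bit unsigned counter). The helper names <code>as_uint32</code> and <code>as_int32</code> are hypothetical, and the baseline value 526,902 is not quoted anywhere in the thread; it is simply the reported sum 526,766 minus the reported delta -136.</div>

```python
def as_uint32(n):
    """Reduce an integer modulo 2**32, as a 32-bit unsigned counter would store it."""
    return n & 0xFFFFFFFF


def as_int32(n):
    """Reinterpret the low 32 bits of n as a signed two's-complement value."""
    n &= 0xFFFFFFFF
    return n - 0x100000000 if n >= 0x80000000 else n


# An unsigned counter that overflows wraps to a small value, never to a negative one.
wrapped = as_uint32(0xFFFFFFFF + 5)

# Only a signed reinterpretation of a large unsigned value yields a negative number.
reinterpreted = as_int32(0xFFFFFFFF - 135)

# A delta between two runs can be negative with no wraparound at all
# (baseline 526,902 is implied by the table row, not quoted in the thread).
delta = 526_766 - 526_902
```

<div>Since the second column is described as a delta from baseline, the third case is the likely reading of the mem2reg row; the first two only show what a genuine 32-bit wrap would look like.</div>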
</div>
<div>
<div>
<div class="gmail_extra"><br>
<div class="gmail_quote">On Thu, Aug 25, 2016 at 9:49 AM, David Callahan <span dir="ltr">
<<a href="mailto:dcallahan@fb.com" target="_blank">dcallahan@fb.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div style="word-wrap:break-word">
<div>
<p class="MsoNormal"><font face="Consolas"><span style="font-size:12px">I</span></font><span style="color:rgb(0,0,0);font-family:Consolas;font-size:9pt"> gathered aggregate </span><span style="font-family:Consolas;font-size:9pt">statistics reported by “-stats”
over the ~400 test files. </span></p>
<p class="MsoNormal"><span style="font-family:Consolas;font-size:9pt">The following </span><span style="font-family:Consolas;font-size:9pt">table summarizes the impact.</span><span style="font-family:Consolas;font-size:9pt">
</span><span style="font-family:Consolas;font-size:9pt">The first column is the</span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas">sum where the new analysis is enabled, the second column is the<u></u><u></u></span></p>
<p class="MsoNormal"><font face="Consolas"><span style="font-size:9pt">delta from baseline where no CFL alias analysis is performed. </span><span style="font-size:12px">I</span><span style="font-size:9pt"> am not <u></u><u></u></span></font></p>
<p class="MsoNormal"><font face="Consolas"><span style="font-size:12px">e</span><span style="font-size:9pt">xperienced </span><span style="font-size:12px">enough</span><span style="font-size:9pt"> to know </span><span style="font-size:12px">which</span><span style="font-size:9pt"> of
these are </span><span style="font-size:12px">“</span><span style="font-size:9pt">good</span><span style="font-size:12px">”</span><span style="font-size:9pt"> or </span><span style="font-size:12px">“</span><span style="font-size:9pt">bad</span><span style="font-size:12px">”</span><span style="font-size:9pt"> indicators<wbr>.</span></font></p>
<p class="MsoNormal"><font face="Consolas"><span style="font-size:12px">—</span><span style="font-size:9pt">david</span></font></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> </span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 72,250 685 SLP # vector instructions generated<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 1,256,401 566 adce # instructions removed<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 67,020,774 13,835,126 basicaa # times a GEP is decomposed<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 11,154 26 basicaa # times the limit to decompose GEPs is reached<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 153,613 324 bdce # instructions removed (unused)<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 198,495 2 bdce # instructions trivialized (dead bits)<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 298,621 0 cfl-od-aa Maximum compressed graph<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 58,462,719 0 cfl-od-aa Number Search Steps<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 48,401 0 cfl-od-aa # NoAlias results absed on address roots<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 61,936 0 cfl-od-aa # NoAlias results on compressed search path<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 3,768,131 0 cfl-od-aa # NoAlias results on fast path<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 47,016,909 0 cfl-od-aa # calls to query()<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 43,172,261 0 cfl-od-aa # instructions analyzed<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 10,515,257 0 cfl-od-aa # times there was no graph node for a value<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 9,895,755 0 cfl-od-aa Total size of compressed graphs (edges)<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 2,797 2 correlated-value-propagation # comparisons propagated<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 66,515 -126 correlated-value-propagation # phis propagated<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 912 3 correlated-value-propagation # sdiv converted to udiv<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 13,527 501 dse # other instrs removed<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 40,973 416 dse # stores deleted<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 126 -2 early-cse # compare instructions CVP'd<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 1,824,703 -138 early-cse # instructions CSE'd<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 1,875,417 87 early-cse # instructions simplified or DCE'd<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 62,505 1 functionattrs # arguments marked nocapture<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 29,979 1 functionattrs # arguments marked readonly<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 42,648 37 globaldce # functions removed<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 40,498 10 globaldce # global variables removed<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 4,368 35 gvn # blocks merged<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 21,961 26 gvn # equalities propagated<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 29,434 45 gvn # instructions PRE'd<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 631,597 3,307 gvn # instructions deleted<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 217,618 494 gvn # instructions simplified<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 51,089 634 gvn # loads PRE'd<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 135,568 1,526 gvn # loads deleted<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 2,197 4 indvars # IV comparisons eliminated<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 826 8 indvars # congruent IVs eliminated<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 2,538 4 indvars # exit values replaced<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 1,856 1 indvars # loop exit tests replaced<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 5,740,738 8 inline # caller-callers analyzed<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 1,629,169 3 inline # functions deleted because all callers found<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 3,563,497 2 inline # functions inlined<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 10,879,125 86 inline-cost # call sites analyzed<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 34,766 5 instcombine # constant folds<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 3,979,078 2,004 instcombine # dead inst eliminated<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 6,323 2 instcombine # dead stores eliminated<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 1,522 4 instcombine # factorizations<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 254,146 66 instcombine # instructions sunk<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 10,427,131 1,749 instcombine # insts combined<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 57,943 -205 instcombine # reassociations<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 1,072 1 instsimplify # expansions<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 135,129 1 instsimplify # reassociations<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 121,777 246 instsimplify # redundant instructions removed<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 27,612 -12 jump-threading # branch blocks duplicated to eliminate phi<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 76,000 197 jump-threading # jumps threaded<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 4,991 8 jump-threading # terminators folded<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 869,838 1,370 lcssa # live out of a loop variables<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 345,329 433 licm # instructions hoisted out of loop<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 702 -27 licm # instructions sunk out of loop<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 19,520 192 licm # load insts hoisted or sunk<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 202 37 licm # memory locations promoted to registers<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 467,244 246 local # unreachable basic blocks removed<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 1,586 34 loop-delete # loops deleted<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 84 27 loop-idiom # memcpy's formed from loop load+stores<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 752 7 loop-idiom # memset's formed from loop stores<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 63,364 -8 loop-rotate # loops rotated<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 4,602 1 loop-simplify # nested loops split out<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 1,244,741 472 loop-simplify # pre-header or exit blocks inserted<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 2,847 2 loop-unroll # loops completely unrolled<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 9,668 -29 loop-unroll # loops unrolled (completely or otherwise)<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 5,799 -35 loop-unroll # loops unrolled with run-time trip counts<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 3,863 25 loop-unswitch # branches unswitched<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 1,054,060 1,482 loop-unswitch Total number of instructions analyzed<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 109,279 -3 loop-vectorize # loops analyzed for vectorization<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 526,766 -136 mem2reg # PHI nodes inserted<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 4,150,078 -3 mem2reg # alloca's promoted with a single store<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 4,567 6 memcpyopt # memcpy instructions deleted<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 96 1 memcpyopt # memcpys converted to memset<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 1,074 173 memcpyopt # memmoves converted to memcpy<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 39,584 6 memcpyopt # memsets inferred<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 179,629 2,475 memdep # block queries that were completely cached<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 1,020 -3 memdep # cached, but dirty, non-local ptr responses<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 9,108,504 146,792 memdep # fully cached non-local ptr responses<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 11,678,674 92,225 memdep # uncached non-local ptr responses<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 399,802 1,832 memory-builtins # arguments with unsolved size and offset<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 10,844 -24,169 memory-builtins # load instructions with unsolved size and offset<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 188,181 54 reassociate # insts reassociated<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 87,009 -82 scalar-evolution # loops with predictable loop counts<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 402,724 71 scalar-evolution # loops without predictable loop counts<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 133,310 72 sccp # basic blocks unreachable<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 275,949 263 sccp # instructions removed<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 2,056,414 723 simplifycfg # blocks simplified<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 5,292 -36 simplifycfg # common instructions sunk down to the end block<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 15,110 1 simplifycfg # speculative executed instructions<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 43,068 -2 sroa Maximum number of uses of a partition<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 11,754,901 -180 sroa # alloca partition uses rewritten<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 4,623,115 -11 sroa # alloca partitions formed<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 5,927,727 -11 sroa # allocas analyzed for replacement<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 4,576,406 -5 sroa # allocas promoted to SSA values<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 13,770,636 -227 sroa # instructions deleted<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> 3,797 -1 strip-dead-prototypes # dead prototypes removed<u></u><u></u></span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> </span></p>
<p class="MsoNormal" style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<span style="font-size:9.0pt;font-family:Consolas"> </span></p>
</div>
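<div>The two columns above can be reproduced with a short script. This is only a sketch, not the script used for the experiment: it assumes each per-file report is the text form of “-stats” with lines shaped like <code>&lt;count&gt; &lt;pass&gt; - &lt;description&gt;</code>, and the names <code>sum_stats</code> and <code>delta_table</code> are hypothetical.</div>

```python
import re
from collections import Counter

# Assumed line shape of a text "-stats" report: "<count> <pass> - <description>".
STAT_LINE = re.compile(r"^\s*(\d+)\s+(\S+)\s+-\s+(.*\S)\s*$")


def sum_stats(reports):
    """Sum every (pass, description) counter over an iterable of report texts."""
    totals = Counter()
    for text in reports:
        for line in text.splitlines():
            m = STAT_LINE.match(line)
            if m:  # banner and blank lines do not match and are skipped
                count, pass_name, desc = m.groups()
                totals[(pass_name, desc)] += int(count)
    return totals


def delta_table(with_analysis, baseline):
    """Return rows of (sum with analysis, delta from baseline, pass, description)."""
    rows = []
    for key in sorted(set(with_analysis) | set(baseline)):
        new = with_analysis.get(key, 0)
        rows.append((new, new - baseline.get(key, 0), key[0], key[1]))
    return rows
```

<div>A statistic that only exists when the new analysis is enabled has no baseline entry, so under this sketch its delta would equal its sum; a negative delta just means the pass fired less often than in the baseline build.</div>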
<span style="color:rgb(0,0,0);font-family:Calibri,sans-serif;font-size:14px">
<div style="font-family:Calibri;font-size:11pt;text-align:left;color:black;BORDER-BOTTOM:medium none;BORDER-LEFT:medium none;PADDING-BOTTOM:0in;PADDING-LEFT:0in;PADDING-RIGHT:0in;BORDER-TOP:#b5c4df 1pt solid;BORDER-RIGHT:medium none;PADDING-TOP:3pt">
<span style="font-weight:bold">From: </span>Daniel Berlin <<a href="mailto:dberlin@dberlin.org" target="_blank">dberlin@dberlin.org</a>><br>
<span style="font-weight:bold">Date: </span>Thursday, August 25, 2016 at 9:06 AM<br>
<span style="font-weight:bold">To: </span>David Callahan <<a href="mailto:dcallahan@fb.com" target="_blank">dcallahan@fb.com</a>><br>
<span style="font-weight:bold">Cc: </span>George Burgess IV <<a href="mailto:george.burgess.iv@gmail.com" target="_blank">george.burgess.iv@gmail.com</a>>, LLVM Dev Mailing list <<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>><br>
<span style="font-weight:bold">Subject: </span>Re: [llvm-dev] CFLAA<br>
</div>
<div>
<div>
<div><br>
</div>
<div>
<div>
<div dir="ltr">Hey David,
<div>I'll take a look at the patch :)</div>
<div>Sounds like fun work.</div>
<div><br>
</div>
<div>As George says, improving AA significantly will almost always cause significant performance regressions at first, in almost any compiler.</div>
<div><br>
</div>
<div>Compiler knobs and passes usually get tuned for x amount of freedom, and if you give them 10x, they start moving things too far, vectorizing too much, spilling, etc. </div>
<div><br>
</div>
<div>This was definitely the case for GCC, where adding a precise interprocedural field-sensitive analysis initially regressed performance by a few percent on average.</div>
<div><br>
</div>
<div>I know it was also the case for XLC at IBM, etc.</div>
<div><br>
</div>
<div>Like anything else, just gotta figure out what passes are going nuts, and rework them to have better heuristics/etc.</div>
<div>The end result is performance improvements, but the path takes a bit of time.</div>
<div><br>
</div>
<div>If you need a way to check in the meantime whether your analysis is doing an okay job, a good indicator is usually how many loads/stores get eliminated or moved by various passes before and after.</div>
<div><br>
</div>
<div>If the number is significantly higher, great.</div>
<div>If the number is significantly lower, something has likely gone wrong :)</div>
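<div>As a rough illustration of that check, the sketch below takes four load/store rows from the statistics table quoted in this thread and expresses each delta as a percentage of its baseline. The choice of rows and the helper name <code>relative_change</code> are assumptions made for the example; what counts as "significantly" higher is left to the reader.</div>

```python
# (sum with CFL analysis enabled, delta from baseline), copied from the table in this thread.
LOAD_STORE_STATS = {
    "dse # stores deleted": (40_973, 416),
    "gvn # loads deleted": (135_568, 1_526),
    "gvn # loads PRE'd": (51_089, 634),
    "licm # load insts hoisted or sunk": (19_520, 192),
}


def relative_change(new_total, delta):
    """Percent change versus the baseline, where baseline = new_total - delta."""
    baseline = new_total - delta
    return 100.0 * delta / baseline


for name, (total, delta) in LOAD_STORE_STATS.items():
    print(f"{name:36s} {delta:+6d} ({relative_change(total, delta):+.2f}%)")
```

<div>All four deltas are positive here, which is the direction the heuristic treats as a good sign.</div>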
<div><br>
</div>
</div>
<div class="gmail_extra"><br>
<div class="gmail_quote">On Thu, Aug 25, 2016 at 8:11 AM, David Callahan via llvm-dev
<span dir="ltr"><<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div style="word-wrap:break-word;color:rgb(0,0,0);font-size:14px;font-family:Calibri,sans-serif">
<div>(Adding “LLVM Dev”)</div>
<div><br>
</div>
<div>My variant is up as <a href="https://reviews.llvm.org/D23876" target="_blank">https://reviews.llvm.org/D23876</a></div>
<div>—david</div>
<div><br>
</div>
<div><br>
</div>
<span>
<div style="font-family:Calibri;font-size:11pt;text-align:left;color:black;BORDER-BOTTOM:medium none;BORDER-LEFT:medium none;PADDING-BOTTOM:0in;PADDING-LEFT:0in;PADDING-RIGHT:0in;BORDER-TOP:#b5c4df 1pt solid;BORDER-RIGHT:medium none;PADDING-TOP:3pt">
<span style="font-weight:bold">From: </span>George Burgess IV <<a href="mailto:george.burgess.iv@gmail.com" target="_blank">george.burgess.iv@gmail.com</a>><br>
<span style="font-weight:bold">Date: </span>Wednesday, August 24, 2016 at 3:17 PM<br>
<span style="font-weight:bold">To: </span>David Callahan <<a href="mailto:dcallahan@fb.com" target="_blank">dcallahan@fb.com</a>><br>
<span style="font-weight:bold">Subject: </span>Re: CFLAA<br>
</div>
<div><br>
</div>
<div>
<div>
<div dir="ltr">
<div>Hi!</div>
<div><br>
</div>
<div>> I see there is ongoing work with alias analysis and it appears the prior CFLAA has been abandoned.<br>
</div>
<div><br>
</div>
<div>There was quite a bit of refactoring done, yeah. The original CFLAA is now called CFLSteens, and graph construction was moved to its own bit. We also have CFLAnders, which is based more heavily on the paper by Zheng and Rugina (e.g. no stratifiedsets magic).</div>
<div><br>
</div>
<div>> I have a variant of it where I reworked how compression was done to be less conservative, reworked the interprocedural analysis to do simulated but bounded inlining, and added code to do on-demand testing of CFL paths on both compressed and full graphs.</div>
<div><br>
</div>
<div>Awesome!</div>
<div><br>
</div>
<div>> Happy to share the patch with you if you are interested as well as some data collected</div>
<div><br>
</div>
<div>Yes, please. Would you mind if I CC'ed llvm-dev on this thread (and a few people specifically, who also might find this interesting)?</div>
<div><br>
</div>
<div>> However I was not able to see any performance improvements in the code. In fact on various benchmarks there were noticeable regressions in measured performance of the generated code. Have you noticed any similar problems?</div>
<div><br>
</div>
<div>I know that a number of people in the community expressed concerns about how other passes will perform with better AA results (e.g. if LICM becomes more aggressive, register pressure may increase, which may cause us to spill when we haven't before,
etc.). So, such a problem isn't unthinkable. :)</div>
<div class="gmail_extra"><br>
<div class="gmail_quote">On Wed, Aug 24, 2016 at 2:56 PM, David Callahan <span dir="ltr">
<<a href="mailto:dcallahan@fb.com" target="_blank">dcallahan@fb.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div style="word-wrap:break-word;color:rgb(0,0,0);font-size:14px;font-family:Calibri,sans-serif">
<div>
<p class="MsoNormal">Hi Greg,<u></u><u></u></p>
<p class="MsoNormal"><u></u> <u></u></p>
<p class="MsoNormal">I see there is ongoing work with alias analysis and it appears the prior CFLAA has been abandoned.
<u></u><u></u></p>
<p class="MsoNormal"><u></u> <u></u></p>
<p class="MsoNormal">I have a variant of it where I reworked how compression was done to be less conservative, reworked the interprocedural analysis to do simulated but bounded inlining, and added code to do on-demand testing of CFL paths on both compressed and full
graphs. <u></u><u></u></p>
<p class="MsoNormal"><u></u> <u></u></p>
<p class="MsoNormal">I reached a point where the ahead-of-time compression was linear but still very accurate compared to on-demand path search and there were noticeable improvements in the alias analysis results and impacted transformations. Happy to share
the patch with you if you are interested as well as some data collected.<u></u><u></u></p>
<p class="MsoNormal"><u></u> <u></u></p>
<p class="MsoNormal">However I was not able to see any performance improvements in the code. In fact on various benchmarks there were noticeable regressions in measured performance of the generated code. Have you noticed any similar problems?<span><font color="#888888"><u></u><u></u></font></span></p>
<span><font color="#888888">
<p class="MsoNormal"><u></u> <u></u></p>
<p class="MsoNormal">--david<u></u><u></u></p>
</font></span></div>
</div>
</blockquote>
</div>
<br>
</div>
</div>
</div>
</div>
</span></div>
<br>
______________________________<wbr>_________________<br>
LLVM Developers mailing list<br>
<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a><br>
<a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" rel="noreferrer" target="_blank">http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a><br>
<br>
</blockquote>
</div>
<br>
</div>
</div>
</div>
</div>
</div>
</span></div>
</blockquote>
</div>
<br>
</div>
</div>
</div>
</blockquote>
</div>
<br>
</div>
</div>
</div>
</div></div></span>
</div>
</blockquote></div><br></div>