<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Oct 26, 2016 at 3:03 PM, Michael Kuperstein <span dir="ltr"><<a href="mailto:mkuper@google.com" target="_blank">mkuper@google.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote"><span>On Wed, Oct 26, 2016 at 1:09 PM, David Li <span dir="ltr"><<a href="mailto:davidxl@google.com" target="_blank">davidxl@google.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">davidxl added inline comments.<br>

<span><br>

<br>

================<br>

Comment at: lib/Transforms/Utils/LoopUnrol<wbr>lPeel.cpp:101<br>

+      // We no longer know anything about the branch probability.<br>

+      LatchBR->setMetadata(LLVMConte<wbr>xt::MD_prof, nullptr);<br>

+    }<br>

----------------<br>

</span><span>mkuper wrote:<br>

> davidxl wrote:<br>

> > Why? I think we should update the branch probability here -- it depends on which iteration the peeled clone represents. If peel count < average/estimated trip count, then each peeled iteration should be more biased towards fall through. If peel_count == est trip_count, then the last peeled iteration should be biased toward exit.<br>

> You're right, it's not that we don't know anything - but we don't know enough. I'm not sure how to attach a reasonable number to this, without knowing the distribution.<br>

> Do you have any suggestions? The trivial option would be to assume an extremely narrow distribution (the loop always exits after exactly K iterations), but that would mean having an extreme bias for all of the branches, and I'm not sure that's wise.<br>

</span>A reasonable way to annotate the branch is like this.<br>

Say the original trip count of the loop is N. Then for the m-th peeled iteration (m from 0 to N-1), the fall-through probability is a decreasing function of m:<br>

<br>

(N - m )/N<br>

<br></blockquote><div><br></div></span><div>I'm not entirely sure the math works out - because N is the average</div></div></div></div></blockquote><div><br></div><div>Yes -- N is the average -- but this is due to a limitation of PGO. To get the trip count distribution, we need to do value profiling of the loop trip count or have a path-sensitive profile. This is future work. For now we need to focus on what we have with good heuristics. </div><div><br></div><div>With current PGO, the back branch probability is already estimated to be N/(N+1), which can be inaccurate depending on the trip count distribution.</div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div> the newly assigned weights ought to have the property that the total probability of reaching the loop header is 0.5 -</div></div></div></div></blockquote><div><br></div><div><br></div><div>Why should this constraint exist? The constraints that should be satisfied are 1) the total frequency of the loop exit remains unchanged; 2) the total header (including cloned ones) frequency equals the original header frequency; 3) the header frequency of the first peeled iteration equals the original preheader frequency.</div><div><br></div><div><br></div><div> The original conditional branch (for the loop back edge) has one shared/'average' branch probability for all iterations. 
Once the branch is cloned via peeling, more contextual (temporal) information is available, and the conditional branch probabilities of those cloned branches can be refined. The intuition is that the closer an iteration is to the end of the loop, the more likely its branch is to exit.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div> and I don't think that happens here.</div></div></div></div></blockquote><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div><br></div><div>This also doesn't solve the problem of what probability to assign to the loop backedge - if K is the random variable signifying the number of iterations, I think it should be something like 1/(E[K | K > E[K]] - E[K]).</div><div>That is, it depends on the expected number of iterations given that we have more iterations than average. Which we don't know, and we can't even bound.<br></div><div>E.g. imagine that we have a loop that runs for 1 iteration for a million times, and a million iterations once. The average number of iterations is 2, but the probability of taking the backedge, once you've reached the loop, is extremely high.</div></div></div></div></blockquote><div><br></div><div>We just need to update the existing branch weights data slightly. Ideally, we can first assign branch probabilities for the conditional branches of the cloned iterations, and then use the constraints I mentioned above to adjust the weights. 
However, I think it can be simplified as follows:</div><div><br></div><div>Suppose the branch weight vector is (WB, WE), where WB is the weight of the edge to the loop header and WE is the weight of the edge to the exit block. Then the new weight can be something like (WB - m*WE, WE), where m is the number of peeled iterations.</div><div><br></div><div>[Proof]. Assume the fall-through probability of the i-th cloned conditional branch is P_i. The header weight of the first cloned iteration is WE, so the total edge weight from the cloned iterations to the exit block is</div><div><br></div><div>  WE * (1 - P_1*P_2*...*P_m).  </div><div><br></div><div>so the new exit edge weight of the remaining loop is</div><div><br></div><div>WE * P_1*P_2*...*P_m</div><div><br></div><div>Assuming each P_i is close to 1, this approximates to WE.</div><div><br></div><div>Similarly, the cloned headers account for about m*WE of the header weight, so the new weight of the edge to the loop header is about (WB - m*WE)</div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div><br></div><div>We could assume something like a uniform distribution between, say, 0 and 2 * N iterations (in which case the fall-through probability is, I think (2 * N - m - 1) / (2 * N), and the backedge probability is something like 1 - 1/(1.5 * N) )  - but I don't know if that's realistic either.</div></div></div></div></blockquote><div><br></div><div>I am not sure making such an assumption about the distribution is a reasonable thing to do.  I think it is more reasonable to assume a narrower distribution and adjust the weight in a simple way (we are not doing anything worse than is already happening today). 
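<br><br>To make the simplified update (WB - m*WE, WE) concrete, here is a minimal Python sketch, not LLVM code. It assumes iteration i of the peeled copies is reached with weight WE*P_1*...*P_(i-1) and that the original header weight is WB + WE. The function name peel_weights and the sample numbers are hypothetical, chosen only for illustration.<br>

```python
def peel_weights(wb, we, fall_through):
    """Compare exact and simplified latch weights of the loop left after peeling.

    wb: weight of the edge back to the loop header
    we: weight of the edge to the exit block
    fall_through: assumed probabilities P_1..P_m that each peeled
                  iteration continues instead of exiting
    Returns ((exact_back, exact_exit), (simplified_back, simplified_exit)).
    """
    m = len(fall_through)
    reach = we            # the first peeled header runs once per loop entry
    peeled_headers = 0.0  # total header weight moved into the peeled copies
    for p in fall_through:
        peeled_headers += reach
        reach *= p        # weight that continues to the next iteration
    # 'reach' is now the weight entering the remaining loop; every such
    # entry leaves through the exit edge exactly once.
    exact_exit = reach
    # Header weight is conserved: original (wb + we) = peeled + remaining,
    # and the remaining header weight splits into back edge + exit edge.
    exact_back = (wb + we) - peeled_headers - exact_exit
    simplified = (wb - m * we, we)
    return (exact_back, exact_exit), simplified


# With P_i == 1 the exact result coincides with the simplified one.
exact, simplified = peel_weights(1000, 10, [1.0, 1.0])
print(exact, simplified)  # (980.0, 10.0) (980, 10)

# With P_i slightly below 1 the two differ only slightly.
exact_close, simplified_close = peel_weights(1000, 10, [0.99, 0.99])
print(exact_close, simplified_close)
```

The sketch does not guard against WB - m*WE dropping below zero, which can happen when the peel count exceeds the estimated trip count; a real implementation would presumably need to handle that case.<br>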
</div><div><br></div><div>David</div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><span><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

Add some fuzzing factor to avoid creating extremely biased branch probabilities:<br>

<br>

for instance (N-m)*3/(4*N)<br>
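<br>
As a rough illustration of the two formulas above, the following Python sketch evaluates the fall-through probability (N - m)/N and its damped variant (N - m)*3/(4*N) for every peeled iteration. The function name and the damping parameter are hypothetical, and the 3/4 factor is just the example value from this comment.<br>

```python
from fractions import Fraction


def fall_through_probability(n, m, damping=Fraction(1)):
    """Fall-through probability of the m-th peeled iteration (0 <= m < n).

    n: estimated (average) trip count of the original loop
    m: index of the peeled iteration
    damping: factor below 1 that keeps the result away from 1
    """
    if not 0 <= m < n:
        raise ValueError("m must satisfy 0 <= m < n")
    return damping * Fraction(n - m, n)


N = 4
plain = [fall_through_probability(N, m) for m in range(N)]
damped = [fall_through_probability(N, m, Fraction(3, 4)) for m in range(N)]
print(plain)   # 1, 3/4, 1/2, 1/4 as Fractions
print(damped)  # 3/4, 9/16, 3/8, 3/16 as Fractions
```

For N = 4 the undamped values run from 1 down to 1/4, while the damped values run from 3/4 down to 3/16, so the first peeled branch is no longer treated as certain to fall through.<br>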

<br>

<br>

<a href="https://reviews.llvm.org/D25963" rel="noreferrer" target="_blank">https://reviews.llvm.org/D2596<wbr>3</a><br>

<br>

<br>

<br>

</blockquote></span></div><br></div></div>

</blockquote></div><br></div></div>