<div dir="ltr"><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Mar 19, 2014 at 1:45 PM, Chandler Carruth <span dir="ltr">&lt;<a href="mailto:chandlerc@google.com" target="_blank">chandlerc@google.com</a>&gt;</span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div class="gmail_quote"><div class=""><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">

<div style="overflow:hidden"> If so, does it<br>

make sense for me to post the current, massive, combined patch of<br>

0003+0004+0005+0006(+0007?) now, so you can look at the algorithm as a whole?</div></blockquote></div></div><br>I'm actually interested in evaluating this totally independent of the bias part. I'm in complete agreement that the flow-solve aspect is orthogonal to what precise metric is useful for triggering different optimizations.</blockquote>

</div><br>I've had some time to both read the patch and think about the specific algorithm. A few high-level comments.</div><div class="gmail_extra"><br></div><div class="gmail_extra">1) I'm pretty sure you can use LoopInfo. It won't be ideal, and we could probably make LoopInfo much better for your use case. In particular, though, the problem of distinguishing between immediate loop blocks and nested loop blocks seems reasonably easy to manage when working bottom-up -- you can keep a set of the nested blocks built at each stage of the upward walk.</div>
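<div class="gmail_extra"><br></div><div class="gmail_extra">To make the bottom-up idea in (1) concrete, here is a minimal Python sketch rather than anything from the patch. The Loop class and the immediate_blocks helper are hypothetical stand-ins; the only property assumed from LoopInfo is that a loop's block list includes the blocks of its nested loops.</div>
<pre>
```python
from dataclasses import dataclass, field


@dataclass
class Loop:
    # Hypothetical stand-in for a loop. `blocks` is assumed to hold every
    # block in the loop, including the blocks of all nested loops.
    name: str
    blocks: set
    subloops: list = field(default_factory=list)


def immediate_blocks(loop, out):
    """Post-order (bottom-up) walk over the loop tree.

    Records in `out` the blocks that belong to each loop directly, and
    returns the set of all blocks seen in `loop` so the parent can
    subtract them from its own block list.
    """
    nested = set()
    for sub in loop.subloops:
        # Children are processed first; their blocks accumulate here.
        nested |= immediate_blocks(sub, out)
    out[loop.name] = loop.blocks - nested
    return nested | out[loop.name]


inner = Loop("inner", {"c", "d"})
outer = Loop("outer", {"a", "b", "c", "d"}, [inner])
result = {}
immediate_blocks(outer, result)
print(sorted(result["outer"]), sorted(result["inner"]))
```
</pre>
<div class="gmail_extra">The nested set only ever grows on the way up, so each loop pays one set difference against the blocks its children already claimed.</div>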

<div class="gmail_extra"><br></div><div class="gmail_extra">2) Outside of the loop structure detection and management, how important are the RPO-traversal bits? Was it just convenient or really important to propagate weights in this way? This could be the real limitation of using LoopInfo -- it doesn't really preserve the RPO structures the way your code does.</div>

<div class="gmail_extra"><br></div><div class="gmail_extra">3) I'm increasingly in favor of just using power-of-two loop scales. I actually can't come up with any use cases where distinguishing between 3 and 4 as the "likely" loop trip count would matter. The only case I can see would be that rounding 3 down to 2 could have a bad effect in 3-iteration-heavy code such as graphics code, but it seems simpler to fix that directly by rounding 3 explicitly up to 4...</div>
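<div class="gmail_extra"><br></div><div class="gmail_extra">As a rough illustration of (3), the sketch below maps an estimated trip count to a power-of-two scale. It assumes the default is to round down to the nearest power of two, which is my reading rather than something settled here, and it special-cases 3 to round up to 4. The function name pow2_loop_scale is hypothetical.</div>
<pre>
```python
def pow2_loop_scale(trip_count):
    """Map an estimated loop trip count to a power-of-two scale.

    Assumption: counts round down to the nearest power of two, except
    that 3 is explicitly rounded up to 4.
    """
    if trip_count == 3:
        return 4
    # Treat zero or negative estimates as a single iteration.
    count = max(trip_count, 1)
    # bit_length() - 1 is the exponent of the largest power of two
    # that does not exceed `count`.
    return 2 ** (count.bit_length() - 1)


for n in (1, 2, 3, 4, 5, 8, 100):
    print(n, pow2_loop_scale(n))
```
</pre>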

<div class="gmail_extra"><br></div><div class="gmail_extra">4) I'm having trouble with the mixture of terminology between mass, weight, and frequency. Do you have a mental model for the terminology you can add to the documentation? (Or did I miss it?) I'm also still concerned about exposing both mass and frequency in the public API. What is the plan there?</div>

<div class="gmail_extra"><br></div><div class="gmail_extra"><br></div><div class="gmail_extra">As a somewhat separate note, I'm curious whether you looked into directly mapping this problem onto minimum-cost flow network solutions along the lines of this thesis: <a href="http://www.cs.technion.ac.il/~royl/MscThesis_Final_Version_Submission.pdf">http://www.cs.technion.ac.il/~royl/MscThesis_Final_Version_Submission.pdf</a></div>

</div>