Hello again,<br>  I'm attaching another patch with some more modifications I made.<br>  Attachment: stprof-08.10.09.patch<br><br>  Below I comment some of the modifications. Some other minor modifications was introduced too.<br>


<br><div class="gmail_quote">On Mon, Sep 21, 2009 at 9:05 PM, Dan Gohman <span dir="ltr"><<a href="mailto:gohman@apple.com" target="_blank">gohman@apple.com</a>></span> wrote:<br>

<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><div><br>

On Sep 20, 2009, at 8:35 AM, Andrei Alvares wrote:<br>

<br>

<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

The passes should override the Pass::releaseMemory virtual function.<br>

Some currently have a Clear member function; they should implement<br>

releaseMemory and have it call Clear, or perhaps Clear should be<br>

renamed releaseMemory, depending on how it's used.<br>

<br>

<br>

Done. But I would like to know more about this functionality. Consider the BranchPreditionPass class. I need to maintain edge predictions for subsequent passes (like BlockEdgeFrequencyPass that will consume it). If the pass manager calls the BranchPredictionPass::releaseMemory, should I free those edge predictions too? The attached patch is freeing it when releaseMemory is called.<br>


</blockquote>

<br></div>

The pass manager won't call releaseMemory until all passes that<br>

depend on the pass have been run, so releaseMemory should release<br>

everything.<div><br>

<br>

<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

<br>

BranchHeuristicsInfo::MatchCallHeuristic and some of the others<br>

don't test whether the terminator is a BranchInst. It looks like<br>

there's nothing preventing them from analyzing a block with an<br>

InvokeInst as its terminator. Is that intended by the heuristics?<br>

<br>

All the functions that process heuristics (Match{Some Heuristic}Heuristic) have a private visibility. The MatchHeuristic is a wrapper for those functions. There is a comment in the MatchHeuristic header claiming that it expects a BasicBlock with exactly two successors. The heuristics are only capable to handle blocks with two successors.<br>


<br>

I haven't add a check inside MatchHeuristic because I'm already ensuring this condition before it is called (on BranchPredictionPass). I was trying to avoid redundant checks. Do you believe that it should have the test anyway (inside BranchHeuristic)?<br>


</blockquote>

<br></div>

No, don't insert redundant checks. An assertion might be useful<br>

though.<div><br></div></blockquote><div><br>An assertion would require an execution of the test, right? I was avoiding testing it again. I haven't added yet.<br> <br></div><div></div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">


<div>

<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

<br>

BranchHeuristicsInfo::postDominates calls<br>

PostDominatorTree::properlyDominates. Since the distinction between<br>

domination and proper domination is often both subtle and critical<br>

for algorithm correctness, please either rename this function or add<br>

comments or both.<br>

<br>

<br>

The algorithm requires post domination rather than properly post dominates. Take the call heuristic for example. If a block contains a call and a loop to itself, the heuristic should match. Properly post domination miss this and other cases.<br>


<br>

DominatorTree have "dominates" and "properlyDominates" functions. How come PostDominatorTree implemented only properlyDominates? That's way I felt the need to created my own.<br>

</blockquote>

<br></div>

I believe this is just because no one has needed it. Feel free<br>

to add it. It should be straight-forward; see DominatorTree's<br>

dominates function for example. The main work is done by the<br>

DominatorTreeBase class.<div><br></div></blockquote><div><br>I believe someone else  added the function "dominates" on PostDominatorTree. The static profiler patch is using it now. While I was testing this functionality, I've noticed a decreasing prediction accuracy. However, seems like my old way to calculate domination on post domination tree was incorrect, which was undesirable.<br>


<br></div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><div>

<br>

<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

<br>

BranchPredictionInfo::FindBackEdges does a DominatorTree query for<br>

every edge in the function. It would be possible to get a close<br>

approximation of this using LoopInfo. That would miss unnatural<br>

loops and spaghetti code though; is that the reason for doing the<br>

full <span style="padding: 0pt; background-color: yellow; color: black; display: inline; font-size: inherit;">traversal</span>? Please add a comment about this.<br>

<br>

<br>

That's an interesting question. At first I thought that it would be possible, but after I've implemented the back edges considering only LoopInfo, the profiler shown a very low hit rate (20% worst in average). I've maintained the back edge calculation with domination tree.<br>


</blockquote>

<br></div>

That's surprising. It would be worth documenting this somewhere,<br>

perhaps in a comment.<div><br></div></blockquote><div><br>I've take another look into it, and I was able to approximate back edges using loop information afterall. Matter of fact, on all of the spec benchmark test, I was able to find exactly the same backedges calculated with DominatorTree. Using loop info, now I can calculate back edges and exit edges together, at the same time. I've also merged findStores and findCalls to avoid another full function traversal.<br>


<br></div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><div>

<br>

<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

<br>

BranchPredictionInfo::FindExitEdges seems to be redundant with<br>

Loop::getExitEdges.<br>

<br>

<br>

Thanks for the tip. Unfortunately, I've tried to use the LoopInfo::getExitEdge in the branch prediction pass, but it is crashing. I can't tell for sure the root cause of the problem, but I believe that this function required that the loops are in simplified form, which is not what happens in some situations.<br>


</blockquote>

<br></div>

Hmm. A comment about this would be appropriate as well.<div><br></div></blockquote><div><br>I couldn't solve this yet. My implementation still brokes if I use LoopInfo::getExitEdges.<br><br></div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">


<div>

<br>

<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

In its current form, BranchPredictionPass won't be usable in LLVM's<br>

default optimization pipeline because it requires post-dominator tree<br>

construction. None of LLVM's current default optimization passes<br>

require post-dominator trees, so constructing it just for this pass<br>

will likely be too slow. It could possibly be justified for -O3,<br>

if there are sufficient benefits to oughtweigh the costs. There are<br>

many LLVM users that don't use the standard -O2 or -O3 passes<br>

also, of course.<br>

<br>

<br>

Unfortunately, four of the nine implemented heuristics require post dominate tree information. The profiler would be very imprecise if those heuristics were lacking. Maybe when LLVM have more profile-guided passes, the benefits will out weight its costs.<br>


</blockquote>

<br></div>

Ok.<br><font color="#888888">

<br>

Dan<br>

<br>

</font></blockquote></div><br>