[llvm-commits] Static Profile Patch

Fri Oct 9 12:47:05 PDT 2009

Hello again,
  I'm attaching another patch with some more modifications I made.
  Attachment: stprof-08.10.09.patch

  Below I comment some of the modifications. Some other minor modifications
was introduced too.

On Mon, Sep 21, 2009 at 9:05 PM, Dan Gohman <gohman at apple.com> wrote:

>
> On Sep 20, 2009, at 8:35 AM, Andrei Alvares wrote:
>
>  The passes should override the Pass::releaseMemory virtual function.
>> Some currently have a Clear member function; they should implement
>> releaseMemory and have it call Clear, or perhaps Clear should be
>> renamed releaseMemory, depending on how it's used.
>>
>>
>> Done. But I would like to know more about this functionality. Consider the
>> BranchPreditionPass class. I need to maintain edge predictions for
>> subsequent passes (like BlockEdgeFrequencyPass that will consume it). If the
>> pass manager calls the BranchPredictionPass::releaseMemory, should I free
>> those edge predictions too? The attached patch is freeing it when
>> releaseMemory is called.
>>
>
> The pass manager won't call releaseMemory until all passes that
> depend on the pass have been run, so releaseMemory should release
> everything.
>
>
>> BranchHeuristicsInfo::MatchCallHeuristic and some of the others
>> don't test whether the terminator is a BranchInst. It looks like
>> there's nothing preventing them from analyzing a block with an
>> InvokeInst as its terminator. Is that intended by the heuristics?
>>
>> All the functions that process heuristics (Match{Some Heuristic}Heuristic)
>> have a private visibility. The MatchHeuristic is a wrapper for those
>> functions. There is a comment in the MatchHeuristic header claiming that it
>> expects a BasicBlock with exactly two successors. The heuristics are only
>> capable to handle blocks with two successors.
>>
>> I haven't add a check inside MatchHeuristic because I'm already ensuring
>> this condition before it is called (on BranchPredictionPass). I was trying
>> to avoid redundant checks. Do you believe that it should have the test
>> anyway (inside BranchHeuristic)?
>>
>
> No, don't insert redundant checks. An assertion might be useful
> though.
>

An assertion would require an execution of the test, right? I was avoiding
testing it again. I haven't added yet.

>
>> BranchHeuristicsInfo::postDominates calls
>> PostDominatorTree::properlyDominates. Since the distinction between
>> domination and proper domination is often both subtle and critical
>> for algorithm correctness, please either rename this function or add
>> comments or both.
>>
>>
>> The algorithm requires post domination rather than properly post
>> dominates. Take the call heuristic for example. If a block contains a call
>> and a loop to itself, the heuristic should match. Properly post domination
>> miss this and other cases.
>>
>> DominatorTree have "dominates" and "properlyDominates" functions. How come
>> PostDominatorTree implemented only properlyDominates? That's way I felt the
>> need to created my own.
>>
>
> I believe this is just because no one has needed it. Feel free
> to add it. It should be straight-forward; see DominatorTree's
> dominates function for example. The main work is done by the
> DominatorTreeBase class.
>

I believe someone else  added the function "dominates" on PostDominatorTree.
The static profiler patch is using it now. While I was testing this
functionality, I've noticed a decreasing prediction accuracy. However, seems
like my old way to calculate domination on post domination tree was
incorrect, which was undesirable.

>
>> BranchPredictionInfo::FindBackEdges does a DominatorTree query for
>> every edge in the function. It would be possible to get a close
>> approximation of this using LoopInfo. That would miss unnatural
>> loops and spaghetti code though; is that the reason for doing the
>> full traversal? Please add a comment about this.
>>
>>
>> That's an interesting question. At first I thought that it would be
>> possible, but after I've implemented the back edges considering only
>> LoopInfo, the profiler shown a very low hit rate (20% worst in average).
>> I've maintained the back edge calculation with domination tree.
>>
>
> That's surprising. It would be worth documenting this somewhere,
> perhaps in a comment.
>

I've take another look into it, and I was able to approximate back edges
using loop information afterall. Matter of fact, on all of the spec
benchmark test, I was able to find exactly the same backedges calculated
with DominatorTree. Using loop info, now I can calculate back edges and exit
edges together, at the same time. I've also merged findStores and findCalls
to avoid another full function traversal.

>
>> BranchPredictionInfo::FindExitEdges seems to be redundant with
>> Loop::getExitEdges.
>>
>>
>> Thanks for the tip. Unfortunately, I've tried to use the
>> LoopInfo::getExitEdge in the branch prediction pass, but it is crashing. I
>> can't tell for sure the root cause of the problem, but I believe that this
>> function required that the loops are in simplified form, which is not what
>> happens in some situations.
>>
>
> Hmm. A comment about this would be appropriate as well.
>

I couldn't solve this yet. My implementation still brokes if I use
LoopInfo::getExitEdges.

>  In its current form, BranchPredictionPass won't be usable in LLVM's
>> default optimization pipeline because it requires post-dominator tree
>> construction. None of LLVM's current default optimization passes
>> require post-dominator trees, so constructing it just for this pass
>> will likely be too slow. It could possibly be justified for -O3,
>> if there are sufficient benefits to oughtweigh the costs. There are
>> many LLVM users that don't use the standard -O2 or -O3 passes
>> also, of course.
>>
>>
>> Unfortunately, four of the nine implemented heuristics require post
>> dominate tree information. The profiler would be very imprecise if those
>> heuristics were lacking. Maybe when LLVM have more profile-guided passes,
>> the benefits will out weight its costs.
>>
>
> Ok.
>
> Dan
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20091009/bc86fb17/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: stprof-08.10.09.patch
Type: text/x-patch
Size: 99724 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20091009/bc86fb17/attachment.bin>