[PATCH] blockfreq: Rewrite block frequency analysis
chandlerc at google.com
Tue Mar 25 14:06:44 PDT 2014
On Wed, Mar 19, 2014 at 1:45 PM, Chandler Carruth <chandlerc at google.com> wrote:
> If so, does it
>> makes sense for me to post the current, massive, combined patch of
>> 0003+0004+0005+0006(+0007?) now, so you can look at the algorithm as a
> I'm actually interested in evaluating this totally independent of the bias
> part. I'm in complete agreement that the flow-solve aspect is orthogonal to
> what precise metric is useful for triggering different optimizations.
I've had some time to both read the patch and think about the specific
algorithm. A few high-level comments.
1) I'm pretty sure you can use LoopInfo. It won't be ideal and we could
probably make LoopInfo much better for your use case, but in particular the
problem of distinguishing between immediate loop blocks and nested loop
blocks seems reasonably easy to manage when working bottom-up -- you can
keep a set of the nested blocks built at each stage of the upward walk.
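
A minimal sketch of the bottom-up idea in (1), not code from the patch. ToyLoop
and collectImmediateBlocks are hypothetical names used only for illustration;
ToyLoop is a stand-in, not LLVM's Loop class. The sketch assumes each loop lists
all of its blocks, nested ones included, and that blocks are identified by
plain integers:

```cpp
#include <cassert>
#include <map>
#include <set>
#include <vector>

// Hypothetical stand-in for a loop-info node (not LLVM's Loop class).
struct ToyLoop {
  std::vector<int> Blocks;                // every block in the loop, nested ones included
  std::vector<const ToyLoop *> SubLoops;  // directly nested loops
};

// Walks the loop nest bottom-up (subloops first). Returns the set of all
// blocks in L and records in Immediate[&L] the blocks that belong to L but
// to none of its subloops.
std::set<int> collectImmediateBlocks(
    const ToyLoop &L, std::map<const ToyLoop *, std::set<int>> &Immediate) {
  // Set of nested blocks, built up while visiting the subloops.
  std::set<int> Nested;
  for (const ToyLoop *Sub : L.SubLoops) {
    assert(Sub && "subloop pointer must be non-null");
    std::set<int> SubBlocks = collectImmediateBlocks(*Sub, Immediate);
    Nested.insert(SubBlocks.begin(), SubBlocks.end());
  }

  // Anything in L that no subloop claimed is an immediate block of L.
  std::set<int> &Own = Immediate[&L];
  for (int B : L.Blocks)
    if (!Nested.count(B))
      Own.insert(B);

  // Hand the full block set up to the parent loop.
  std::set<int> All(L.Blocks.begin(), L.Blocks.end());
  All.insert(Nested.begin(), Nested.end());
  return All;
}
```

Each level only needs the sets returned by its direct subloops, so the
immediate/nested distinction falls out of the walk itself.
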
2) Outside of the loop structure detection and management, how important
are the RPO-traversal bits? Was it just convenient or really important to
propagate weights in this way? This could be the real limitation of using
LoopInfo -- it doesn't really preserve the RPO structures the way your code does.
3) I'm increasingly in favor of just using power-of-two loop scales. I
actually can't come up with any use cases where distinguishing between 3
and 4 as the "likely" loop trip count would matter. The only case I can see
would be that rounding 3 down to 2 could have a bad effect in
3-iteration-heavy code such as graphics code, but it seems simpler to fix
that directly by rounding 3 explicitly up to 4...
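
A minimal sketch of the rounding in (3), not code from the patch. The function
name roundScaleToPowerOfTwo is hypothetical. The text above only asks that 3
round up to 4; sending every other exact midpoint (6, 12, ...) upward as well
is an assumption made here to keep the rule uniform:

```cpp
#include <cassert>
#include <cstdint>

// Rounds a loop scale to the nearest power of two. Exact midpoints round up,
// so 3 becomes 4 rather than 2. (Tie-breaking for midpoints other than 3 is
// an assumption of this sketch.)
uint64_t roundScaleToPowerOfTwo(uint64_t Scale) {
  assert(Scale != 0 && "loop scale must be at least 1");

  // Largest power of two that is <= Scale.
  uint64_t Lower = 1;
  while (Lower <= Scale / 2)
    Lower *= 2;
  if (Scale == Lower)
    return Lower;

  // Saturate instead of overflowing when the next power of two is 2^64.
  if (Lower > UINT64_MAX / 2)
    return Lower;

  uint64_t Upper = Lower * 2;
  // Pick the closer neighbor; ties go to the larger one.
  return (Scale - Lower >= Upper - Scale) ? Upper : Lower;
}
```

With this rule 5 maps to 4, while 3, 6, and 7 map to the next power of two up.
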
4) I'm having trouble with the mixture of terminology between mass, weight,
and frequency. Do you have a mental model for the terminology you can add
to the documentation? (Or did I miss it?) I'm also still concerned about
exposing both mass and frequency in the public API. What is the plan there?
As a somewhat separate note, I'm curious if you looked into directly
mapping this problem into minimum-cost flow network solutions along the
lines of this thesis?