[PATCH] blockfreq: Rewrite block frequency analysis

Chandler Carruth chandlerc at google.com
Tue Mar 25 19:23:20 PDT 2014


First off, sorry this is now in two threads....

On Tue, Mar 25, 2014 at 5:57 PM, Duncan P. N. Exon Smith <
dexonsmith at apple.com> wrote:

> On Mar 25, 2014, at 2:06 PM, Chandler Carruth <chandlerc at google.com>
> wrote:
> > 2) Outside of the loop structure detection and management, how important
> > are the RPO-traversal bits? Was it just convenient or really important to
> > propagate weights in this way? This could be the real limitation of using
> > LoopInfo -- it doesn't really preserve the RPO structures the way your
> > code does.
>
> The RPO-traversal is important for distributing mass (you need everything
> to arrive at a node before you know how much to distribute to a successor).
>

Yea, that makes sense.
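
To make sure I'm picturing the acyclic part correctly, here is roughly how I
understand it -- purely an illustrative sketch, with made-up names rather
than the patch's actual types, and with back edges/loop scales handled
separately:

  // Illustrative sketch only: walk the blocks in reverse post-order so a
  // node's mass is complete (every predecessor has already contributed)
  // before it is split among its successors by edge weight.
  #include <cstdint>
  #include <utility>
  #include <vector>

  struct Node {
    uint64_t Mass = 0;                              // mass received so far
    std::vector<std::pair<Node *, uint32_t>> Succs; // successor, edge weight
  };

  void distributeMass(const std::vector<Node *> &RPO, uint64_t EntryMass) {
    if (RPO.empty())
      return;
    RPO.front()->Mass = EntryMass; // entry block comes first in RPO
    for (Node *N : RPO) {
      uint64_t TotalWeight = 0;
      for (const auto &S : N->Succs)
        TotalWeight += S.second;
      if (!TotalWeight)
        continue;
      for (const auto &S : N->Succs)
        // N->Mass is final here precisely because of the RPO walk.
        // (Rounding error is ignored in this sketch.)
        S.first->Mass += N->Mass / TotalWeight * S.second;
    }
  }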


> Nevertheless, I’m also starting to feel skeptical about rolling my own,
> particularly because it’s complicated code that’s somewhat orthogonal to
> what I’m doing, so it really needs its own set of tests.
>
> I’d like to commit it as is, and then try to reuse LoopInfo before turning
> the
> new block frequency on.
>

And this makes a great deal of sense (similarly in the other thread). I
like the general plan here.


>
> > 3) I'm increasingly in favor of just using power-of-two loop scales. I
> > actually can't come up with any use cases where distinguishing between 3
> > and 4 as the "likely" loop trip count would matter. The only case I can
> > see would be that rounding 3 down to 2 could have a bad effect in
> > 3-iteration-heavy code such as graphics code, but it seems simpler to fix
> > that directly by rounding 3 explicitly up to 4…
>
> Andy was worried about numbers up to 8.  It should be straightforward to
> hardcode the boundaries for the scales 3, 5, 6, and 7, and then use
> power-of-two scales for the rest.
>

Ok. I still don't see the importance of 5, 6, and 7... If those special
cases add much complexity, maybe we could get away with just special-casing
3? Then again, maybe they don't add much complexity.
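
Concretely, the special case could be as small as something like this
(hypothetical helper, just to show the shape of it; not code from the
patch):

  // Hypothetical sketch: round a computed loop scale to a power of two,
  // but bump 3 up to 4 instead of truncating it down to 2.
  #include <cstdint>

  static uint64_t roundLoopScale(uint64_t Scale) {
    if (Scale <= 1)
      return 1;
    if (Scale == 3)
      return 4; // the one explicit special case
    uint64_t P = 1;
    while (P <= Scale / 2) // everything else rounds down to a power of two
      P <<= 1;
    return P;
  }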

> The code will get even harder to look at, though.  To prevent the precision
> issues I was hitting originally, I’ll have to do some floating point-like
> stuff.  I’m fine either way, but it will be even harder for someone else to
> come and figure out what’s going on.
>

Could it be encapsulated in an even narrower utility that is simpler to
understand? I don't fully understand the impact of this, and I'm hesitant to
suggest writing the code both ways just to stare at them and throw one away.
And I can see how the PositiveFloat abstraction is, in some sense, an easier
thing to build and work with in the abstract.

I was hoping this would play out as computing the loop scales and then just
falling back on saturating integer math.
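
Roughly this kind of thing, purely for illustration (I'm not proposing this
exact code or these names):

  // Sketch of "fall back on saturating integer math": multiply a frequency
  // by a loop scale and clamp at the maximum instead of carrying an extra
  // exponent around the way a PositiveFloat-style type would.
  #include <cstdint>
  #include <limits>

  static uint64_t saturatingScale(uint64_t Freq, uint64_t Scale) {
    if (Scale != 0 &&
        Freq > std::numeric_limits<uint64_t>::max() / Scale)
      return std::numeric_limits<uint64_t>::max(); // saturate on overflow
    return Freq * Scale;
  }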

> > 4) I'm having trouble with the mixture of terminology between mass,
> > weight, and frequency. Do you have a mental model for the terminology you
> > can add to the documentation? (Or did I miss it?)
>
> I do have a mental model; some of it was in the docs, but not all.  I’ll
> fix that.
>
> > I'm also still concerned about exposing both mass and frequency in the
> > public API. What is the plan there?
>
> BlockMass was only exposed so that I could test
> BlockMass*BranchProbability.  I think the right solution is to move that
> into BranchProbability*uint64_t and test it there.
>
> I.e., I don’t think it should be exposed.


Ok. I think having the terminology clarified will also help me understand
the code.
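
For reference, the operation I understand us to be talking about is scaling
a 64-bit value by a numerator/denominator pair without overflowing -- this
is only an illustration of the arithmetic, not the actual BranchProbability
interface:

  // Illustrative only: scale X by Numerator/Denominator using a 128-bit
  // intermediate (a GCC/Clang extension) so the multiply cannot overflow
  // before the divide.
  #include <cassert>
  #include <cstdint>

  static uint64_t scaleByProbability(uint64_t X, uint32_t Numerator,
                                     uint32_t Denominator) {
    assert(Denominator != 0 && "probability denominator must be nonzero");
    return static_cast<uint64_t>(
        (static_cast<__uint128_t>(X) * Numerator) / Denominator);
  }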


The only thing I think really needs to be sorted out prior to commit (other
than the items you've already said you plan to address) is the positive
float issue. I'm just a bit hesitant to put that into Support only to rip it
back out. ;] But maybe that doesn't matter too much.

