r198640 - CodeGen: Initial instrumentation based PGO implementation

Justin Bogner mail at justinbogner.com
Mon Jan 13 13:42:33 PST 2014


Eric Christopher <echristo at gmail.com> writes:
> Hi Justin,
>
> Is it possible to separate out the "instrument to get counters" from "annotate
> based on information" parts of the PGO class?

I like this idea as well, but there's enough overlap between the two
parts that it's a bit tricky to do in a way that actually makes things
better. I'll keep this in mind as I make further changes.

> It would be great to see a top level description of the algorithm you're using
> to do region creation etc along with an example. It seems that you're creating
> regions from a way different than most previous literature on the topic so a
> literature link would be good if you've got one, else a description is going
> to be key.

I am planning on providing a write up about all of this, but I
obviously haven't done so yet. Rest assured, I haven't forgotten.

> Few comments on the code:
>
>     +  PGO.setCurrentRegionCount(0);
>
> Should comment this as it's not obvious from reading the surrounding code.

I've changed these uses to a method called setCurrentRegionUnreachable
in r199138. Hopefully that makes things a little more obvious.

>     +  Cnt.beginRegion(Builder);
>        eval.begin(*this);
>        LValue lhs = EmitLValue(expr->getTrueExpr());
>        eval.end(*this);
>     +  Cnt.adjustFallThroughCount();
>
> The adjust stuff is wonky enough that you should comment all uses of it. It
> isn't obvious why/when/where you'd want to do it. Same with when you apply
> adjustments.
>
> Basically if it's going to be a part of any abnormal edge then perhaps it
> makes more sense to handle it as part of abnormal edge calculations? What
> about the code needs this sort of handling? Ditto with the parent count
> subtractions you see at loops.

That every use needs documentation makes me think that this is poorly
named an documented. I've renamed it to adjustForControlFlow and
improved the docs in r199139. Let me know if you still think we need
more local comments.

>
>     +    PGO.setCurrentRegionCount(0);
>
> Go ahead document why you're resetting the region count here.
>  
>
>        // As long as the condition is true, iterate the loop.
>     -  if (EmitBoolCondBranch)
>     -    Builder.CreateCondBr(BoolCondVal, LoopBody, LoopExit.getBlock());
>     +  if (EmitBoolCondBranch) {
>     +    Builder.CreateCondBr(BoolCondVal, LoopBody, LoopExit.getBlock(),
>     +                         PGO.createBranchWeights(LoopCount, ExitCount));
>     +  }
>
> Don't think the braces are necessary here.

Right. r199136.

>     +    uint64_t Total = CaseCnt.getCount() - CaseCnt.getParentCount();
>     +    unsigned NCases = Range.getZExtValue() + 1;
>     +    // Divide the weights evenly between the cases, ensuring that the
>     total
>     +    // weight is preserved. Ie, a weight of 5 over three cases will be
>     +    // distributed as weights of 2, 2, and 1.
>     +    uint64_t Weight = Total / NCases, Rem = Total % NCases;
>     +    for (unsigned I = 0; I != NCases; ++I) {
>     +      if (SwitchWeights)
>     +        SwitchWeights->push_back(Weight + (Rem ? 1 : 0));
>     +      if (Rem)
>     +        Rem--;
>
> Is this related to propagating some difference between parent counts and
> "normal" counts?

Not at all! This is because we only have one counter for the case
statement range, but we need weights for each case. I've tried to make
that a little clearer in the comment in r199140.

>
>     -  // If the body of the case is just a 'break', and if there was no
>     fallthrough,
>     -  // try to not emit an empty block.
>     -  if ((CGM.getCodeGenOpts().OptimizationLevel > 0) &&
>     +  // If the body of the case is just a 'break', try to not emit an empty
>     block.
>     +  // If we're profiling or we're not optimizing, leave the block in for
>     better
>     +  // debug and coverage analysis.
>     +  if (!CGM.getCodeGenOpts().ProfileInstrGenerate &&
>     +      CGM.getCodeGenOpts().OptimizationLevel > 0 &&
>            isa<BreakStmt>(S.getSubStmt())) {
>          JumpDest Block = BreakContinueStack.back().BreakBlock;
>
> Is it really worth that much for code gen/optimization time to not just always
> have it? Also, do we really want instrumentation to change the cfg we emit?
> Also, it doesn't change in debug mode, just without optimization so that part
> of the comment isn't quite accurate :)

Well, we can't elide the block if we're profiling, since it won't be
empty in that case (the instrumentation code has to go somewhere!). It
looks like the original change to skip the empty block went in to clean
up -O0 CodeGen (r129652) but that made things messier for debugging and
for gcov (PR9796) so conditionalizing on OptimizationLevel > 0 was added
in r154420.

Presumably at anything above -O0 the backend will remove the empty block
anyway, so it's probably safe to remove this logic completely. Thoughts?

> There's probably a bit more, but this seems like a good place to start given
> the questions at the top.
>
> -eric




More information about the cfe-commits mailing list