[PATCH] Propagation of profile samples through the CFG.

Fri Jan 10 11:59:47 PST 2014

On Thu, Jan 9, 2014 at 7:35 AM, Chandler Carruth <chandlerc at gmail.com> wrote:

> It was not clear to me at first that this is an inversion flag. I'd like a better name, but I don't have any particular ideas. I wonder if it would be more clear by passing in the domtree pointer to check?

Done. I changed the parameter to pass a DominatorTreeBase pointer. The
caller passes DT->DT or PDT->DT accordingly.

> ================
> Comment at: lib/Transforms/Scalar/SampleProfile.cpp:563-565
> @@ +562,5 @@
> +    BasicBlock *BB2 = *I;
> +    bool IsDomParent =
> +        (CheckDT) ? DT->dominates(BB2, BB1) : PDT->dominates(BB2, BB1);
> +    bool IsInSameLoop = LI->getLoopFor(BB1) == LI->getLoopFor(BB2);
> +    if (BB1 != BB2 && VisitedBlocks.insert(BB2) && IsDomParent &&
> ----------------
> I suspect there is a more direct way to compute this, but I don't want to hold up the review thinking about it -- maybe a FIXME?
>
> Specifically, I think that we might be able to phrase this purely in terms of LoopInfo and DomTree.

Sorry, I'm not following. This is expressed in terms of loop info and
dominance. What do you want to change?

> ================
> Comment at: lib/Transforms/Scalar/SampleProfile.cpp:578-581
> @@ +577,6 @@
> +      // members of its equivalence set.
> +      uint32_t &bb1_weight = BlockWeights[BB1];
> +      uint32_t &bb2_weight = BlockWeights[BB2];
> +      if (bb2_weight > bb1_weight)
> +        bb1_weight = bb2_weight;
> +    }
> ----------------
> bb1_weight = std::max(bb1_weight, bb2_weight);

D'oh! Done.

> Comment at: lib/Transforms/Scalar/SampleProfile.cpp:620
> @@ +619,3 @@
> +    // class by making BB2's equivalence class be BB1.
> +    SmallVector<BasicBlock *, 8> DominatedBBs;
> +    DT->getDescendants(BB1, DominatedBBs);
> ----------------
> You should probably put this outside the loop.

That would be wrong. BB1 changes with every iteration of the loop.

>
> ================
> Comment at: lib/Transforms/Scalar/SampleProfile.cpp:624-625
> @@ +623,4 @@
> +
> +    // Repeat the same logic for all the blocks post-dominated by BB1.
> +    // We are looking for every basic block BB2 such that:
> +    //
> ----------------
> Why do we need to do this?
>
> If we visit every block as BB1, and then visit every block BB1 dominates as BB2, we will have considered all pairs of blocks where A dominates B, and tested for B post-dominating A. We don't need to consider pairs of blocks where B post-dominates A because we will reject the ones where A does not dominate B.

Note that we do not necessarily visit every block as BB1. We only
visit the blocks for which we have equivalence class information. This
means that initially, only those blocks with samples in them have been
assigned into an equivalence class.

So, if in a if-then-else diamond, you only have samples inside the
'then' basic block, you need to look up from the 'then' to be able to
propagate into the 'if'. Using just dominance properties for 'then'
will not allow you to propagate its weight.

>
> ================
> Comment at: lib/Transforms/Scalar/SampleProfile.cpp:578
> @@ +577,3 @@
> +      // members of its equivalence set.
> +      uint32_t &bb1_weight = BlockWeights[BB1];
> +      uint32_t &bb2_weight = BlockWeights[BB2];
> ----------------
> Chandler Carruth wrote:
>> bb1_weight = std::max(bb1_weight, bb2_weight);
> Naming convention here.

OK, though I'm pretty confused. Your line_iterator patch uses this
convention. And, in fact, I've seen _ used as separator in several
places. Do we have different conventions for different parts of the
compiler? The coding guidelines don't seem to say anything about
having different conventions.

>
> ================
> Comment at: lib/Transforms/Scalar/SampleProfile.cpp:940
> @@ +939,3 @@
> +///
> +///    - If all the edges are known except one, and the weight of the
> +///      block is already known, the weight of the unknown edge will
> ----------------
> Incoming edges or outgoing edges? Or both?

Incoming or outgoing. Clarified.

> ================
> Comment at: lib/Transforms/Scalar/SampleProfile.cpp:937-938
> @@ +936,4 @@
> +///
> +///    - If B has a single predecessor/successor, then the weight
> +///      of that edge is the weight of the block.
> +///
> ----------------
> How do you handle contradictions?

>From user annotations? I would give a warning, if I could. Right now,
it's a silent override.

> ================
> Comment at: lib/Transforms/Scalar/SampleProfile.cpp:951-953
> @@ +950,5 @@
> +///
> +/// Since this propagation is not guaranteed to finalize for every CFG, we
> +/// only allow it to proceed for a limited number of iterations (controlled
> +/// by -sample-profile-max-propagate-iterations).
> +///
> ----------------
> This is a really frustrating property. Can it be easily avoided?

Not without changing the algorithm itself.

> The first of the above steps is obviously deterministic.
>
> For the second and third, I feel like the following is guaranteed to make forward progress and terminate:
> 0) Add all blocks to a list to process.
> 1) Take the first block out of the list
> 2) if we can add a weight to an edge connected to this block (including a self referential edge) do so and add the other block the edge connects to to the list (if not already in the list)
> 3) repeat #2 until we cannot add a weight to any edge connected to this block
> 4) if the list is not empty, go to #1
> We only add a block to the list if we add a weight to an edge, so we can at most visit V+E blocks.

Yes, I thought about a similar scheme (in fact, mimicking standard
value propagation). But this changes the algorithm. Which is something
I was trying to avoid at first.

Diego.