[PATCH] Propagation of profile samples through the CFG.

Diego Novillo dnovillo at google.com
Tue Nov 26 10:15:21 PST 2013


Hi chandlerc,

    This adds a propagation heuristic to convert instruction samples
    into branch weights. It is similar to the heuristic that Dehao Chen
    implemented in GCC.

    The propagation proceeds in 3 phases:

    1- Assignment of block weights. All the basic blocks in the function
       are initially assigned the same weight as their most frequently
       executed instruction.

    2- Creation of equivalence classes. Since samples may be missing from
       blocks, we can fill in the gaps by setting the weights of all the
       blocks in the same equivalence class to the same weight. To decide
       equivalence, we use dominance and loop information.
       Two blocks B1 and B2 are in the same equivalence class if B1
       dominates B2, B2 post-dominates B1 and both are in the same loop.

    3- Propagation of block weights into edges. This uses a simple
       propagation heuristic. The following rules are applied to every
       block B in the CFG:

       - If B has a single predecessor/successor, then the weight
         of that edge is the weight of the block.

       - If all the edges are known except one, and the weight of the
         block is already known, the weight of the unknown edge will
         be the weight of the block minus the sum of all the known
         edges. If the sum of all the known edges is larger than B's weight,
         we set the unknown edge weight to zero.

       - If there is a self-referential edge, and the weight of the block is
         known, the weight for that edge is set to the weight of the block
         minus the weight of the other incoming edges to that block (if
         known).
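
    To make the "all edges known except one" rule concrete, here is a
    minimal standalone sketch. It does not use LLVM types, and the name
    inferUnknownEdge and its parameters are hypothetical, chosen only for
    illustration (they are not taken from the patch). It assumes C++17 for
    std::optional; an empty optional stands for an edge whose weight is
    not yet known.

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical helper, not from the patch. Given the weight of block B and
// the weights of its edges on one side (std::nullopt = not yet known),
// return the weight for the single unknown edge. Returns std::nullopt when
// the rule does not apply (no unknown edge, or more than one).
std::optional<uint64_t>
inferUnknownEdge(uint64_t BlockWeight,
                 const std::vector<std::optional<uint64_t>> &EdgeWeights) {
  uint64_t KnownSum = 0;
  unsigned NumUnknown = 0;
  for (const auto &W : EdgeWeights) {
    if (W)
      KnownSum += *W;
    else
      ++NumUnknown;
  }
  if (NumUnknown != 1)
    return std::nullopt;
  // If the known edges already exceed the block weight, clamp to zero.
  return BlockWeight > KnownSum ? BlockWeight - KnownSum : 0;
}
```

    In this sketch the single predecessor/successor rule falls out as a
    special case: with one edge and no known weights, the result is the
    block weight itself.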

    Since this propagation is not guaranteed to converge for every CFG, we
    only allow it to proceed for a limited number of iterations (controlled
    by -sample-profile-max-propagate-iterations). The default is 100, the
    same as GCC's.
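
    The iteration cap can be pictured as a bounded fixed-point loop. The
    sketch below is only an illustration: propagateUntilStable and the
    Step callback are hypothetical names, not the patch's actual code.
    Step is assumed to run one propagation pass over all blocks and to
    return true if any weight changed.

```cpp
#include <functional>

// Hypothetical driver, not from the patch. Calls Step() until it reports
// no change or MaxIterations passes have run. Returns the number of passes.
unsigned propagateUntilStable(const std::function<bool()> &Step,
                              unsigned MaxIterations = 100) {
  unsigned Iterations = 0;
  bool Changed = true;
  while (Changed && Iterations < MaxIterations) {
    Changed = Step();
    ++Iterations;
  }
  return Iterations;
}
```

    A pass that never settles stops after MaxIterations calls instead of
    looping forever, which is the purpose of the limit.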

    Before propagation starts, the pass builds (for each block) a list of
    unique predecessors and successors. This is necessary to handle
    identical edges in multiway branches. Since we visit all blocks and all
    edges of the CFG, it is cleaner to build these lists once at the start
    of the pass.
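
    As a rough sketch of the deduplication step, assume blocks are
    numbered 0..N-1 and the raw successor lists may repeat a target (for
    example, a switch with several cases jumping to the same block). The
    names UniqueEdges and buildUniqueEdges are hypothetical and the
    representation is simplified; the patch works on LLVM basic blocks.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical container, not from the patch: per-block lists of unique
// predecessors and successors, indexed by block number.
struct UniqueEdges {
  std::vector<std::vector<int>> Preds, Succs;
};

// Build both lists in one walk over the raw successor lists. Assumes every
// successor index is a valid block number.
UniqueEdges buildUniqueEdges(const std::vector<std::vector<int>> &RawSuccs) {
  UniqueEdges U;
  U.Preds.resize(RawSuccs.size());
  U.Succs.resize(RawSuccs.size());
  for (std::size_t B = 0; B < RawSuccs.size(); ++B) {
    for (int S : RawSuccs[B]) {
      auto &Out = U.Succs[B];
      if (std::find(Out.begin(), Out.end(), S) != Out.end())
        continue; // Edge B->S was already recorded; skip the duplicate.
      Out.push_back(S);
      U.Preds[S].push_back(static_cast<int>(B));
    }
  }
  return U;
}
```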

    Finally, the patch fixes the computation of relative line locations.
    The profiler emits lines relative to the function header. To find the
    header line, we traverse the compilation unit looking for the subprogram
    corresponding to the function. The line number of that subprogram is the
    line where the function begins. That becomes line zero for all the
    relative locations.
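
    For example, if the subprogram for a function is at line 20, an
    instruction at absolute line 23 has relative line 3. A minimal sketch
    of that arithmetic follows; relativeLine is a hypothetical name, and
    the sketch assumes the instruction line is never smaller than the
    header line.

```cpp
// Hypothetical helper, not from the patch. Both arguments are absolute
// line numbers in the same source file; assumes InstLine >= HeaderLine.
unsigned relativeLine(unsigned InstLine, unsigned HeaderLine) {
  return InstLine - HeaderLine;
}
```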

    The propagator needs some tweaks. I've got some test cases that are
    still not handled properly. I will send new test cases shortly. I want
    to have an initial version of the propagator that is not too far from
    the GCC implementation.

http://llvm-reviews.chandlerc.com/D2274

Files:
  lib/Transforms/Scalar/SampleProfile.cpp
  test/Transforms/SampleProfile/Inputs/propagate.prof
  test/Transforms/SampleProfile/branch.ll
  test/Transforms/SampleProfile/propagate.ll
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D2274.1.patch
Type: text/x-patch
Size: 49351 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20131126/ce4679d5/attachment.bin>

