<html>
    <head>
      <base href="http://llvm.org/bugs/" />
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW --- - Information loss and performance issues in branch probability representation"
   href="http://llvm.org/bugs/show_bug.cgi?id=22718">22718</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>Information loss and performance issues in branch probability representation
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>libraries
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>trunk
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>PC
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>All
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>normal
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>P
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>Global Analyses
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>dnovillo@google.com
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>dnovillo@google.com
          </td>
        </tr>

        <tr>
          <th>CC</th>
          <td>davidxl@google.com, llvmbugs@cs.uiuc.edu
          </td>
        </tr>

        <tr>
          <th>Classification</th>
          <td>Unclassified
          </td>
        </tr></table>
      <p>
        <div>
        <pre>BranchProbabilityInfo, MachineBranchProbabilityInfo mimic the raw profile data
representation and represents branch probability using non-normalized weights.
We have found some issues where this representation loses information.

Scaling and capping of branch weights may lead to distorted probability for
very hot branches:

 - The weight for each branch is capped at UINT32_MAX/number_successors. The
max weight limit imposed is too small for long-running applications -- hot
branches can easily hit the cap.

 - When any branch weight hits the limit, the implementation does not actually
do any scaling on weights of other branches. For hot branches, this can cause a
highly biased branch to become non-biased. Cascading bad effects will also
occur in the frequency propagation stage due to the capping.

This happens in BranchProbabilityInfo::calcMetadataWeights, as well as
MachineBranchProbabilityInfo::getSumForBlock(const MachineBasicBlock *MBB,
uint32_t &Scale) const

Additionally, the API for getting branch probabilities has some performance
issues:

For many clients where getting relative branch hotness is enough, the client
can simply use getEdgeWeight interface. However if the client needs to know the
actual branch probability, getEdgeProbability is needed. To compute edge
probability, the implementation needs to compute getSumForBlock which needs to
walk through all successors. This has O(N^2) behavior when we need to compute
branch prob for each out going cfg edge.

Instead of representing branch probabilities using non-normalized weights, we
could represent probabilities directly using a fixed point representation with
scaling (between 0 and 1). The benefits would be:

1) conceptually simple. This would allow us to not rely on weight.
2) clients using getEdgeWeight are not affected. The weight can be derived from
getEdgeProb
3) The implementation for getEdgeProb becomes faster.


Finally, we could change the DenseMap hash key for edge probabilities.
Currently, edge probabilities are stored in a map using Pair<BasicBlock*,
SuccessIndex> as the key. Retrieving branch probability information is slow, as
it needs to walk through successors, find the match, and then use the successor
index to do the lookup. A solution that used pair<BasicBlock*, BasicBlock*> as
the key would avoid that.</pre>
        </div>
      </p>
      <hr>
      <span>You are receiving this mail because:</span>
      
      <ul>
          <li>You are on the CC list for the bug.</li>
      </ul>
    </body>
</html>