[PATCH] D10825: Improvement on computing edge probabilities when choosing the best successor in machine block placement.

Tue Jul 28 16:46:37 PDT 2015

> On 2015-Jul-28, at 16:26, Cong Hou <congh at google.com> wrote:
> 
> On Tue, Jul 28, 2015 at 4:20 PM, Chandler Carruth <chandlerc at gmail.com> wrote:
>> On Tue, Jul 28, 2015 at 3:58 PM Cong Hou <congh at google.com> wrote:
>>> 
>>> I found it not straightforward to normalize weights in MBPI. Note that
>>> MBPI reads weights from MBB, which provides the interface
>>> addSuccessor(MachineBasicBlock *succ, uint32_t weight) to let users indicate
>>> arbitrary weights for new added edges. Because we don't know how users use
>>> this interface, it can be difficult to adjust weights in MBB. I think there
>>> may be two solutions:
>>> 
>>> 1. We provide the interface normalizeWeights in MBB. Once we call this
>>> interface, the sum of weights is guaranteed to be less than UINT32_MAX.
>>> However, after the user add new successors with weights, we need to
>>> re-normalize it. This can still be useful when we want to calculate branch
>>> probabilities or do other adjustment after the weights are normalized.
>>> 
>>> 2. We maintained both normalized and original weights in MBB, so that once
>>> the original weights is modified, we also adjust the normalized weights.
>>> This solutions requires more space and more frequent calculations comparing
>>> to the first one, but the normalized weights are always guaranteed.
>>> 
>>> Do you have any other ideas? Or which one do you think is better?
>> 
>> 
>> First off, yuck, but thanks for digging into this.
>> 
>> 3. Another alternative might be to keep the sum stored, and when adding a
>> weight, we could simply check if that would overflow, if so, re-normalize
>> everything. It would require auditing code that adds a successor to ensure
>> it doesn't rely on the weights after adding a successor being the same, but
>> that seems unlikely to come up.
>> 
>> I'd particularly like to hear Duncan's thoughts on which of these options
>> would make most sense. I see pros and cons on all of these. I think I'd be
>> fine with either #1 or #3. I agree with your concern over the space
>> requirements of #2.

I like #3 the best, in theory, but I feel like #1 is an incremental
step in the right direction and is easier to design.

For #1, I would keep a sum so we *know* whether it's less than
`UINT32_MAX`, and then users that care about that can call the
`normalizeWeights()` API at an appropriate point and trust that the
operation is a no-op most of the time.

(One strange idea I had, if you want to add asserts in places: we
could add an "epoch", which is the number of times the weights have
been normalized.  Then the users can assert that they have the right
epoch.  I haven't really thought about whether this is a good idea (it
probably isn't) but I thought I'd put it down in case it's useful to
you.)

> Thanks for the suggestion! For #3 solution, what if the user first
> calculates weights for several successors and add them one by one? Do
> we need to notify the user how we normalize the weights so that the
> user can adjust the weights that are not added yet?

This is kind of ugly.  I think if we called `normalizeWeights()` in the
accessors (no-op most of the time) instead of the mutators, that would
solve this, but that makes the accessors non-const :(.

If you come up with a safe design and do the auditing work, then this
approach (#3 variant) SGTM, but #1 might be easier for now...