Add 'cold' attribute to functions

Chandler Carruth chandlerc at google.com
Fri May 24 08:44:22 PDT 2013


Catching up on this thread, and just wanted to point out...

On Fri, May 24, 2013 at 5:25 AM, Diego Novillo <dnovillo at google.com> wrote:

> On 2013-05-24 02:10 , Duncan Sands wrote:
>
>>
>> it's not necessary for correctness, it's an optimization.  It simply
>> allocates
>> TI->getNumSuccessors() slots in the vector at once, rather than
>> incrementally
>> resizing the vector as elements are added.  Using ->reserve doesn't
>> change the
>> vector semantics in any way, for example after the ->reserve call the
>> vector
>> will still report its size as being zero.  It's not clear in this case if
>> doing
>> this is a win, since the TI->getNumSuccessors() elements are distributed
>> between
>> two vectors not one.
>>
>
> Sure, I realize that.  I'm just wondering whether it'ss worth it. On the
> one hand, I don't expect the fan out of a basic block to ever be too large,
> and on the other, I expected push_back to resize the vector exponentially.
>  However, it doesn't really hurt, so I just did the change.


I actually thought of this when I helped Diego with the initial version of
the patch (and I thought of it before when i wrote the code that this is
patterned after in the other part of this file). However, I don't think
reserve is correct here.

As Duncan mentioned, the successors are distributed between two vectors,
not one. Specifically, the overwhelming majority of cases none of the
successors are cold. The second most common pattern is that all of the
successors are cold. Somewhere below that (statistically) is the case where
only *some* of the edges are cold. So this is almost always going to
reserve 2x more memory than we actually need.

Now, none of this really matters in the common case -- the number of
successors is 2. However, in that case the reserve is a no-op, so why
bother?

So, maybe we have lots of switches in our code. What is the skew then? If
anything, I would say, the break down remains:

80% -> no cold successors
10% -> all cold successors
9% -> exactly one cold successor
1% -> everything else

So, reserving both is again almost always (99%) going to allocate 2x the
memory we need.

I would suggest removing the reserve here. I think it this is a premature
optimization of the typical kind -- we don't even really know if it would
help, and it might actually hurt.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20130524/84456de4/attachment.html>


More information about the llvm-commits mailing list