[llvm-dev] Computing block profile weights

Tue Dec 6 11:41:02 PST 2016

Jonas, I assume you are talking about Sample-Based PGO. Yes, the problem
you mentioned exists -- and your proposed solution seems reasonable. +dehao
for comments.

David

On Fri, Dec 2, 2016 at 1:40 AM, Jonas Wagner via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

> Hello,
>
> I'm working on an application that would benefit from knowing the weight
> of a basic block, as in "fraction of the program's execution time spent in
> this block".
>
> Currently, I'm computing this using the block's frequency from
> BlockFrequencyInfo, relative to the function's entry block frequency, and
> scaled by the function's entry count. This is also the computation that's
> done by getBlockProfileCount
> <https://github.com/llvm-mirror/llvm/blob/master/lib/Analysis/BlockFrequencyInfoImpl.cpp#L540> in
> lib/Analysis/BlockFrequencyInfoImpl.cpp.
>
> The problem is that this method can be extremely imprecise, because many
> functions have an entry count of zero. The entry count is computed from the
> number of profile samples in the entry block. Depending on the function's
> CFG, this count can be arbitrarily low even though the function is
> frequently called or hot.
>
> Here's an idea to address this. I'd like to collect a bit of feedback from
> the community before trying it out.
>
> 1) Instead of relying on a function's *entry count*, store the *total
> number of samples* in a function. This number is readily available from
> the profile loader.
> 2) Compute a block's weight as `function_samples * block_weight /
> sum_of_block_weights_in_function`
>
> Why do I like this?
>
> - Total samples in a function gives a good impression of the importance of
> a function, better than the entry count.
> - This scheme "preserves mass" in that all samples of a function are taken
> into account. The samples in a BB are compared to samples in the entire
> function, rather than a few (arbitrarily) selected samples from the entry
> block.
> - The computation avoids imprecision from multiplying by small numbers.
>
> Disadvantages?
>
> - BlockFrequencyInfo needs to keep track of the total frequency in a
> function.
> - BlockFrequencyInfo would probably scale the frequencies w.r.t. that
> total, rather then the maximum frequency. This loses a few bits of precision
>
> Note that the entry count would not be lost in this scheme; one could
> easily compute it as `function_samples * entry_weight /
> sum_of_block_weights_in_function`.
>
> I believe using an entire function as unit of reference is a good
> compromise between precision and modularity. Precision is high because
> there's a sufficient number of samples available in a function. Modularity
> is preserved because the computation does not need to take other functions
> into account (in fact, BlockFrequencyInfo already processes one function at
> at time).
>
> What do people think about this?
> - Jonas
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20161206/3b629ee1/attachment.html>