[LLVMdev] Loss of precision with very large branch weights

Fri Apr 24 11:32:44 PDT 2015

Since we have decided to capture the global hotness using function
entry count, it is ok to CAP the max weight as of today. I think the
remaining bug in PR22718 is that when capping happens, the scaling is
not done. That needs to be fixed.

The efficiency of the interface is a different issue.

David

On Fri, Apr 24, 2015 at 11:18 AM, Diego Novillo <dnovillo at google.com> wrote:
>
> In PR 22718, we are looking at issues with long running applications
> producing non-representative frequencies.  For example, in these two loops:
>
> int g = 0;
> __attribute__((noinline)) void bar() {
>  g++;
> }
>
> extern int printf(const char*, ...);
>
> int main()
> {
>   int i, j, k;
>
>   for (i = 0; i < 1000000; i++)
>     bar();
>
>   printf ("g = %d\n", g);
>   g = 0;
>
>   for (i = 0; i < 500000; i++)
>     bar();
>
>   printf ("g = %d\n", g);
> }
>
> I expect the loop body frequency for the second loop to be about half of the
> first one. This holds fine for this test case:
>
> $ bin/opt -analyze -block-freq -S unbiased-branches.ll
> Printing analysis 'Block Frequency Analysis' for function 'bar':
> block-frequency-info: bar
>  - entry: float = 1.0, int = 8
>
> Printing analysis 'Block Frequency Analysis' for function 'main':
> block-frequency-info: main
>  - entry: float = 1.0, int = 8
>  - for.cond: float = 500001.5, int = 4000011
>  - for.body: float = 500000.5, int = 4000003
>  - for.inc: float = 500000.5, int = 4000003
>  - for.end: float = 1.0, int = 8
>  - for.cond1: float = 250001.5, int = 2000011
>  - for.body3: float = 250000.5, int = 2000003
>  - for.inc4: float = 250000.5, int = 2000003
>  - for.end6: float = 1.0, int = 8
>
>
> But if I manually modify the frequencies of both to get close to MAX_INT32,
> the ratios between the frequencies do not reflect reality. For example, if I
> change branch_weights in both loops to be 4,294,967,295 and 2,147,483,647
>
> $ bin/opt -analyze -block-freq -S unbiased-branches.ll
> Printing analysis 'Block Frequency Analysis' for function 'bar':
> block-frequency-info: bar
>  - entry: float = 1.0, int = 8
>
> Printing analysis 'Block Frequency Analysis' for function 'main':
> block-frequency-info: main
>  - entry: float = 1.0, int = 8
>  - for.cond: float = 1073741824.4, int = 8589934595
>  - for.body: float = 1073741823.4, int = 8589934587
>  - for.inc: float = 1073741823.4, int = 8589934587
>  - for.end: float = 1.0, int = 8
>  - for.cond1: float = 1073741824.4, int = 8589934595
>  - for.body3: float = 1073741823.4, int = 8589934587
>  - for.inc4: float = 1073741823.4, int = 8589934587
>  - for.end6: float = 1.0, int = 8
>
> Now both loops are considered equally hot.
>
> Duncan, I think that if I were to make branch_weights a 64-bit integer, this
> would not be an issue. But I'm not sure if I'm not missing something else
> here.
>
>
> Thanks.  Diego.