[PATCH] D12603: Use fixed-point representation for BranchProbability

Fri Sep 11 17:34:08 PDT 2015

> On 2015-Sep-10, at 14:15, Xinliang David Li <davidxl at google.com> wrote:
> 
> Cong,  It is more common to use power of 2 as the scaling factor in
> fixed point representation for computation efficiency.  For 32bit
> width, 0x80000000 is common (noted as Q31).  Here are the benefits
> 
> 1) it is much faster to do the scale operation with Q31 -- instead of
> using the current slow version of scale, it can call a faster version
> to be introduced.
> 
> This will probably help resolve bug
> https://llvm.org/bugs/show_bug.cgi?id=24620 where 20% of the time is
> spent in the slow version of the scale method.

Is there a way to speed up scaling with UINT32_MAX?  I suspect so, but
haven't worked it out.  (E.g., you could scale to UINT64_C(1) << 32 and
ignore the first dither; if there's no dithering, subtract 1 from the
last.)

> 2) it allows negative fractional number to be represented

Not relevant for probabilities, though.

> 3) it is more precise for lots of common cases. For instance,
> representation of 0.5, 0.25, 0.75 (i.e. 1/2, 1/4, 3/4) etc are
> actually lossless with Q31 while the current choice of using
> uint32_max can not represent them precisely.

It's less precise for lots of common cases, too.  2^32's only prime
factor is 2, but 2^32-1 has 3, 5, 17, etc.

This argument seems weak either way.

> 
> (if needed in the future, we can also extend this to be a template
> class where number of bits in the scaling factor is a template
> parameter to support values > 1
>   template <int bits> class FixedPoint {
>     static const int ScalingFactor = 1 << bits;
> };
> )

Sure, applies to either.

> 
> In the IR dump, I find it is easier to read
> 1) percentage,
> 2) M/N form when N is less than 10.

I think we should always print (and check) for a percentage, but if you're
debugging something, you need to see exactly what's in memory, and the
hex is precise and easy to reason about.

> 
> Current dump of 0x10ffff/0xffffffff form is hard to read -- can then
> be simplified to make the denominator < 10?
> 
> 
> Duncan, what is your opinion?

If I'm going to be tracking down some bug by looking at DEBUG prints, I
suspect I'll find 0xffffffff easier to reason about than 0x80000000.  So
if we can keep UINT32_MAX, I'd prefer that.

> 
> thanks,
> 
> David
> 
> 
> 
> 
> 
> On Thu, Sep 10, 2015 at 12:21 PM, Cong Hou <congh at google.com> wrote:
>> congh updated this revision to Diff 34474.
>> congh added a comment.
>> 
>> Update the patch by fixing block frequency / branch probability unit tests failures.
>> 
>> I am not sure if this is the right way to update the unit tests. The fixed-point representation has precision issues comparing to the previous rational representation and that is why we need to update the comparison right answers in unit tests.
>> 
>> 
>> http://reviews.llvm.org/D12603
>> 
>> Files:
>>  include/llvm/Analysis/BlockFrequencyInfoImpl.h
>>  include/llvm/Support/BranchProbability.h
>>  lib/Support/BranchProbability.cpp
>>  lib/Target/ARM/ARMBaseInstrInfo.cpp
>>  test/Analysis/BlockFrequencyInfo/basic.ll
>>  test/Analysis/BlockFrequencyInfo/loops_with_profile_info.ll
>>  test/Analysis/BranchProbabilityInfo/basic.ll
>>  test/Analysis/BranchProbabilityInfo/loop.ll
>>  test/Analysis/BranchProbabilityInfo/noreturn.ll
>>  test/Analysis/BranchProbabilityInfo/pr18705.ll
>>  test/Analysis/BranchProbabilityInfo/pr22718.ll
>>  test/CodeGen/AArch64/fast-isel-branch-cond-split.ll
>>  test/CodeGen/ARM/2013-10-11-select-stalls.ll
>>  test/CodeGen/ARM/ifcvt4.ll
>>  test/CodeGen/ARM/sjlj-prepare-critical-edge.ll
>>  test/CodeGen/ARM/tail-merge-branch-weight.ll
>>  test/CodeGen/ARM/test-sharedidx.ll
>>  test/CodeGen/Thumb2/thumb2-ifcvt1.ll
>>  test/Transforms/SampleProfile/branch.ll
>>  test/Transforms/SampleProfile/calls.ll
>>  test/Transforms/SampleProfile/discriminator.ll
>>  test/Transforms/SampleProfile/fnptr.ll
>>  test/Transforms/SampleProfile/propagate.ll
>>  unittests/Support/BlockFrequencyTest.cpp
>>  unittests/Support/BranchProbabilityTest.cpp
>>