[PATCH] D12603: Use fixed-point representation for BranchProbability
Duncan P. N. Exon Smith via llvm-commits
llvm-commits at lists.llvm.org
Tue Sep 8 14:42:26 PDT 2015
> On 2015-Sep-08, at 10:53, Cong Hou <congh at google.com> wrote:
>
> On Tue, Sep 8, 2015 at 9:39 AM, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote:
>
> > On 2015-Sep-05, at 22:41, David Li <davidxl at google.com> wrote:
> >
> >
> > ================
> > Comment at: test/Analysis/BlockFrequencyInfo/basic.ll:71
> > @@ -70,3 +70,3 @@
> > ; The 'case_c' branch is predicted more likely via branch weight metadata.
> > -; CHECK-NEXT: case_c: float = 0.8,
> > +; CHECK-NEXT: case_c: float = 0.8
> > case_c:
> > ----------------
> > where does the comma come from?
>
> The comma was there to confirm that we get *exactly* 0.8. Why has
> that changed?
>
> With the fixed-point representation, we won't get exactly 0.8 here but 0.8000000001. Should I restrict the output precision when printing this value, use a regex, or match 0.8000000001 exactly in the test case?
I don't like that we don't get exactly 0.8 here, and frankly I don't
quite understand it. It's certainly possible to get exactly 0.8:
--
$ bc
scale=100
3435973836/4294967295
.8000000000000000000000000000000000000000000000000000000000000000000\
000000000000000000000000000000000
--
What's going wrong?
(If it really is unavoidable, I think restricting output precision makes
the most sense, so we get nice predictable output. But I want to be sure
it's actually necessary, first.)
I have a few more comments on the patch below.
> Index: test/Analysis/BlockFrequencyInfo/bad_input.ll
> ===================================================================
> --- test/Analysis/BlockFrequencyInfo/bad_input.ll
> +++ test/Analysis/BlockFrequencyInfo/bad_input.ll
> @@ -10,7 +10,7 @@
> br label %for.body
>
> ; Check that we get 1,4 instead of 0,3.
> -; CHECK-NEXT: for.body: float = 4.0,
> +; CHECK-NEXT: for.body: float = 3.99
> for.body:
> %i = phi i32 [ 0, %entry ], [ %inc, %for.body ]
> call void @g(i32 %i)
> @@ -43,7 +43,7 @@
>
> ; Check that the exit weight is half of entry, since half is lost in the
> ; infinite loop above.
> -; CHECK-NEXT: for.end: float = 0.5,
> +; CHECK-NEXT: for.end: float = 0.49
All these testcases being slightly off is suspicious to me. 32 bits
ought to be enough to represent these values closely.
> for.end:
> ret void
> }
> Index: test/Transforms/SampleProfile/propagate.ll
> ===================================================================
> --- test/Transforms/SampleProfile/propagate.ll
> +++ test/Transforms/SampleProfile/propagate.ll
> @@ -97,8 +97,8 @@
> %div4 = sdiv i64 %10, 4, !dbg !22
> %cmp5 = icmp sgt i64 %9, %div4, !dbg !22
> br i1 %cmp5, label %if.then6, label %if.else7, !dbg !22
> -; CHECK: edge if.end -> if.then6 probability is 3 / 6342 = 0.0473037%
> -; CHECK: edge if.end -> if.else7 probability is 6339 / 6342 = 99.9527% [HOT edge]
> +; CHECK: edge if.end -> if.then6 probability is 0%
This is a fairly lossy change. We've lost all the significant digits
of 0.0473037%, and made it look like 1:0. I hope that's unnecessary.
> +; CHECK: edge if.end -> if.else7 probability is 100%
>
> if.then6: ; preds = %if.end
> %11 = load i32, i32* %y.addr, align 4, !dbg !24
> Index: lib/Support/BranchProbability.cpp
> ===================================================================
> --- lib/Support/BranchProbability.cpp
> +++ lib/Support/BranchProbability.cpp
> @@ -15,16 +15,42 @@
> #include "llvm/Support/Debug.h"
> #include "llvm/Support/Format.h"
> #include "llvm/Support/raw_ostream.h"
> +#include <cassert>
>
> using namespace llvm;
>
> raw_ostream &BranchProbability::print(raw_ostream &OS) const {
> - return OS << N << " / " << D << " = "
This was nice because it gave us something exact for debugging.
I suggest you maintain that by printing `N` in hexadecimal with padding
(so that things line up nicely):
--
0x00000000
0x023ad039
0xffffffff
--
This is similar to how BlockMass is printed.
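Something along these lines (a sketch; `formatRawProbability` is an
illustrative name, and LLVM's `format()` would do the same job):

```cpp
#include <cassert>
#include <cinttypes>
#include <cstdio>
#include <string>

// Illustrative: print the raw 32-bit numerator as zero-padded hex so
// values of different magnitudes line up in a column.
inline std::string formatRawProbability(uint32_t N) {
  char Buf[11]; // "0x" + 8 hex digits + NUL
  std::snprintf(Buf, sizeof(Buf), "0x%08" PRIx32, N);
  return Buf;
}
```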
> - << format("%g%%", ((double)N / D) * 100.0);
> + return OS << format("%d%%",
> + static_cast<int>(std::round(((double)N / D) * 100.0)));
Why change the way these are rounded and printed?
> }
>
> void BranchProbability::dump() const { print(dbgs()) << '\n'; }
>
> +BranchProbability::BranchProbability(uint32_t n, uint32_t d) {
> + assert(d > 0 && "Denominator cannot be 0!");
> + assert(n <= d && "Probability cannot be bigger than 1!");
> + if (d == D)
> + N = n;
> + else {
> + uint64_t Prob64 = n * static_cast<uint64_t>(D) / d;
> + N = static_cast<uint32_t>(Prob64);
It doesn't look like this is rounding correctly. Can you add some
unit tests?
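For reference, round-to-nearest is just a matter of biasing the
dividend by half the divisor (a sketch, not the patch's code):

```cpp
#include <cassert>
#include <cstdint>

// Illustrative round-to-nearest version of the scaling above: adding
// d/2 before the truncating division rounds to the nearest numerator.
inline uint32_t scaleRounded(uint32_t n, uint32_t d) {
  uint64_t Prob64 = (uint64_t(n) * UINT32_MAX + d / 2) / d;
  return uint32_t(Prob64);
}
// E.g. 1/6 scales to 715827882.5; truncation gives 715827882, while
// rounding gives 715827883.
```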
> + }
> +}
> +
> +BranchProbability &BranchProbability::operator+=(BranchProbability RHS) {
> + assert(N <= D - RHS.N &&
> + "The sum of branch probabilities should not exceed one!");
> + N += RHS.N;
> + return *this;
> +}
> +
> +BranchProbability &BranchProbability::operator-=(BranchProbability RHS) {
> + assert(N >= RHS.N &&
> + "Can only subtract a smaller probability from a larger one!");
> + N -= RHS.N;
> + return *this;
> +}
> +
> static uint64_t scale(uint64_t Num, uint32_t N, uint32_t D) {
> assert(D && "divide by 0");
>
> Index: include/llvm/Support/BranchProbability.h
> ===================================================================
> --- include/llvm/Support/BranchProbability.h
> +++ include/llvm/Support/BranchProbability.h
> @@ -15,36 +15,44 @@
> #define LLVM_SUPPORT_BRANCHPROBABILITY_H
>
> #include "llvm/Support/DataTypes.h"
> -#include <cassert>
>
> namespace llvm {
>
> class raw_ostream;
>
> -// This class represents Branch Probability as a non-negative fraction.
> +// This class represents Branch Probability as a non-negative fraction that is
> +// no greater than 1. It uses a fixed-point-like implementation, in which the
> +// denominator is always a constant value (here we use UINT32_MAX for maximum
> +// precision).
> class BranchProbability {
> // Numerator
> uint32_t N;
>
> - // Denominator
> - uint32_t D;
> + // Denominator, which is a constant value.
> + static const uint32_t D = UINT32_MAX;
> +
> + // Construct a BranchProbability with only numerator assuming the denominator
> + // is UINT32_MAX. For internal use only.
> + explicit BranchProbability(uint32_t n) : N(n) {}
It seems nice to expose this functionality publicly without having to
pass `UINT32_MAX` as the denominator.
--
static BranchProbability getRaw(uint32_t N)
--
Or maybe you can come up with a better name than `getRaw()`.
>
> public:
> - BranchProbability(uint32_t n, uint32_t d) : N(n), D(d) {
> - assert(d > 0 && "Denominator cannot be 0!");
> - assert(n <= d && "Probability cannot be bigger than 1!");
> - }
> + BranchProbability() : N(0) {}
> + BranchProbability(uint32_t n, uint32_t d);
Can you fix the style here in a separate NFC commit? The parameters
should really be 'N' and 'D'.
> +
> + bool isZero() const { return N == 0; }
>
> - static BranchProbability getZero() { return BranchProbability(0, 1); }
> - static BranchProbability getOne() { return BranchProbability(1, 1); }
> + static BranchProbability getZero() { return BranchProbability(0); }
> + static BranchProbability getOne() { return BranchProbability(D); }
> +
> + // Normalize given probabilties so that the sum of them becomes one.
> + template <class ProbabilityList>
> + static void normalizeProbabilities(ProbabilityList &Probs);
>
> uint32_t getNumerator() const { return N; }
> - uint32_t getDenominator() const { return D; }
> + static uint32_t getDenominator() { return D; }
>
> // Return (1 - Probability).
> - BranchProbability getCompl() const {
> - return BranchProbability(D - N, D);
> - }
> + BranchProbability getCompl() const { return BranchProbability(D - N); }
>
> raw_ostream &print(raw_ostream &OS) const;
>
> @@ -66,24 +74,67 @@
> /// \return \c Num divided by \c this.
> uint64_t scaleByInverse(uint64_t Num) const;
>
> - bool operator==(BranchProbability RHS) const {
> - return (uint64_t)N * RHS.D == (uint64_t)D * RHS.N;
> + BranchProbability &operator+=(BranchProbability RHS);
> + BranchProbability &operator-=(BranchProbability RHS);
> + BranchProbability &operator*=(BranchProbability RHS) {
> + N = static_cast<uint64_t>(N) * RHS.N / D;
> + return *this;
> }
> - bool operator!=(BranchProbability RHS) const {
> - return !(*this == RHS);
> +
> + BranchProbability operator+(BranchProbability RHS) const {
> + BranchProbability Prob(*this);
> + return Prob += RHS;
> }
> - bool operator<(BranchProbability RHS) const {
> - return (uint64_t)N * RHS.D < (uint64_t)D * RHS.N;
> +
> + BranchProbability operator-(BranchProbability RHS) const {
> + BranchProbability Prob(*this);
> + return Prob -= RHS;
> }
> +
> + BranchProbability operator*(BranchProbability RHS) const {
> + BranchProbability Prob(*this);
> + return Prob *= RHS;
> + }
> +
> + bool operator==(BranchProbability RHS) const { return N == RHS.N; }
> + bool operator!=(BranchProbability RHS) const { return !(*this == RHS); }
> + bool operator<(BranchProbability RHS) const { return N < RHS.N; }
> bool operator>(BranchProbability RHS) const { return RHS < *this; }
> bool operator<=(BranchProbability RHS) const { return !(RHS < *this); }
> bool operator>=(BranchProbability RHS) const { return !(*this < RHS); }
> +
> + friend BranchProbability operator*(BranchProbability LHS, uint32_t RHS);
> + friend BranchProbability operator/(BranchProbability LHS, uint32_t RHS);
Why do these need to be friends? Can't they reach in and access the
numerator via `getNumerator()`?
> };
>
> -inline raw_ostream &operator<<(raw_ostream &OS, const BranchProbability &Prob) {
> +inline raw_ostream &operator<<(raw_ostream &OS, BranchProbability Prob) {
Why change this? If you really prefer passing by value, please do it in
a separate NFC commit so it doesn't add noise to this patch.
> return Prob.print(OS);
> }
>
> +inline BranchProbability operator*(BranchProbability LHS, uint32_t RHS) {
> + LHS.N *= RHS;
Why do we need these? They seem error-prone. It's also new API we
didn't need before, so I'm a little skeptical...
What's the story for overflow? I imagine we'd want saturation
behaviour, or, possibly more robust, an assertion.
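If you keep it, saturation could look something like this (a sketch;
the name is illustrative):

```cpp
#include <cassert>
#include <cstdint>

// Illustrative saturating multiply for the numerator: clamp at the
// fixed denominator (probability one) instead of wrapping around.
inline uint32_t mulSaturating(uint32_t N, uint32_t Scale) {
  uint64_t Wide = uint64_t(N) * Scale;
  return Wide > UINT32_MAX ? UINT32_MAX : uint32_t(Wide);
}
```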
> + return LHS;
> +}
> +
> +inline BranchProbability operator/(BranchProbability LHS, uint32_t RHS) {
> + LHS.N /= RHS;
Assertion about 0? Rounding? (Do we even need this?)
> + return LHS;
> +}
> +
> +inline BranchProbability operator*(uint32_t LHS, BranchProbability RHS) {
> + return RHS * LHS;
> +}
> +
> +template <class ProbabilityList>
> +void BranchProbability::normalizeProbabilities(ProbabilityList &Probs) {
> + uint64_t Sum = 0;
> + for (auto Prob : Probs)
> + Sum += Prob.N;
> + if (Sum > 0)
Early return instead of branching?
Actually, this looks like bad input if `Sum` is 0. Should this just
assert?
> + for (auto &Prob : Probs)
> + Prob.N = Prob.N * uint64_t(D) / Sum;
I don't think the resulting numerators are guaranteed to sum to
`UINT32_MAX`, since there can be rounding errors. You need to dither
somehow. Can you add some unit tests and fix the bugs?
> +}
> +
> }
>
> #endif
>