[PATCH] PGO: scale counters to 32-bits
Justin Bogner
mail at justinbogner.com
Mon Mar 10 23:00:13 PDT 2014
"Duncan P. N. Exon Smith" <dexonsmith at apple.com> writes:
> PGO counters are 64-bit and branch weights are 32-bit. This patch
> updates the code to scale them down when necessary, instead of just
> using the lower 32 bits.
The code LGTM. I have some comments about your comments :)
> diff --git a/lib/CodeGen/CodeGenPGO.cpp b/lib/CodeGen/CodeGenPGO.cpp
> index cdd411c..29af47b 100644
> --- a/lib/CodeGen/CodeGenPGO.cpp
> +++ b/lib/CodeGen/CodeGenPGO.cpp
> @@ -872,29 +872,49 @@ void CodeGenPGO::destroyRegionCounters() {
> delete RegionCounts;
> }
>
> +static uint64_t calculateWeightScale(uint64_t MaxWeight) {
> + // Divide by UINT32_MAX, then add 1 to intentionally over-estimate. If we
> + // don't over-estimate the scale, adding 1 in scaleBranchWeight() could
> + // overflow.
I'd make this a doc comment, even though I don't think those show up
anywhere for static functions.
Also, the wording is a bit confusing - You could probably just explain
that this scales down to (2^32) - 1 and be done with it. No need to
mention division and addition.
> + return MaxWeight < UINT32_MAX ? 1 : MaxWeight / UINT32_MAX + 1;
> +}
> +static uint32_t scaleBranchWeight(uint64_t Weight, uint64_t Scale) {
> + // According to Laplace's Rule of Succession, it is better to compute the
> + // weight based on the count plus 1.
This could be another doc comment with some minor rewording.
> + assert(Scale && "scale by 0?");
> + uint64_t Scaled = Weight / Scale + 1;
> + assert(Scaled <= UINT32_MAX);
A message would be nice in this assert.
> + return Scaled;
> +}
> +
> llvm::MDNode *CodeGenPGO::createBranchWeights(uint64_t TrueCount,
> uint64_t FalseCount) {
> if (!TrueCount && !FalseCount)
> return 0;
>
> + // Calculate how to scale down to 32-bits.
> + uint64_t Scale = calculateWeightScale(std::max(TrueCount, FalseCount));
> +
> llvm::MDBuilder MDHelper(CGM.getLLVMContext());
> - // TODO: need to scale down to 32-bits
> - // According to Laplace's Rule of Succession, it is better to compute the
> - // weight based on the count plus 1.
> - return MDHelper.createBranchWeights(TrueCount + 1, FalseCount + 1);
> + return MDHelper.createBranchWeights(scaleBranchWeight(TrueCount, Scale),
> + scaleBranchWeight(FalseCount, Scale));
> }
>
> llvm::MDNode *CodeGenPGO::createBranchWeights(ArrayRef<uint64_t> Weights) {
> - llvm::MDBuilder MDHelper(CGM.getLLVMContext());
> - // TODO: need to scale down to 32-bits, instead of just truncating.
> - // According to Laplace's Rule of Succession, it is better to compute the
> - // weight based on the count plus 1.
> + // The weight is irrelevant if there are fewer than two.
"We need at least two elements to create meaningful weights"?
> + if (Weights.size() < 2)
> + return 0;
> +
> + // Calculate how to scale down to 32-bits.
> + uint64_t Scale = calculateWeightScale(*std::max_element(Weights.begin(),
> + Weights.end()));
> +
> SmallVector<uint32_t, 16> ScaledWeights;
> ScaledWeights.reserve(Weights.size());
> - for (ArrayRef<uint64_t>::iterator WI = Weights.begin(), WE = Weights.end();
> - WI != WE; ++WI) {
> - ScaledWeights.push_back(*WI + 1);
> - }
> + for (uint64_t W : Weights)
> + ScaledWeights.push_back(scaleBranchWeight(W, Scale));
> +
> + llvm::MDBuilder MDHelper(CGM.getLLVMContext());
> return MDHelper.createBranchWeights(ScaledWeights);
> }
>
> diff --git a/test/Profile/Inputs/c-counter-overflows.profdata b/test/Profile/Inputs/c-counter-overflows.profdata
> new file mode 100644
> index 0000000..377a085
> --- /dev/null
> +++ b/test/Profile/Inputs/c-counter-overflows.profdata
> @@ -0,0 +1,10 @@
> +main 8
> +1
> +68719476720
> +64424509425
> +68719476720
> +21474836475
> +21474836475
> +21474836475
> +4294967295
> +
> diff --git a/test/Profile/c-counter-overflows.c b/test/Profile/c-counter-overflows.c
> new file mode 100644
> index 0000000..ddbe6d1
> --- /dev/null
> +++ b/test/Profile/c-counter-overflows.c
> @@ -0,0 +1,49 @@
> +// Test that big branch weights get scaled down to 32-bits, rather than just
> +// truncated.
> +
> +// RUN: %clang_cc1 -triple x86_64-apple-macosx10.9 -main-file-name c-counter-overflows.c %s -o - -emit-llvm -fprofile-instr-use=%S/Inputs/c-counter-overflows.profdata | FileCheck %s
> +
> +#include <stdint.h>
> +
> +// PGOGEN: @[[MAIN:__llvm_pgo_ctr[0-9]*]] = private global [2 x i64] zeroinitializer
> +int main(int argc, const char *argv[]) {
> + // Need counts higher than 32-bits.
> + // CHECK: br {{.*}} !prof ![[FOR:[0-9]+]]
> + // max = 0xffffffff0
> + // scale = 0xffffffff0 / 0xffffffff + 1 = 17
> + // loop-body: 0xffffffff0 / 17 + 1 = 0xf0f0f0f0 + 1 = 4042322161 => -252645135
> + // loop-exit: 0x000000001 / 17 + 1 = 0x00000000 + 1 = 1 => 1
> + for (uint64_t I = 0; I < 0xffffffff0; ++I) {
> + // max = 0xffffffff * 15 = 0xefffffff1
> + // scale = 0xefffffff1 / 0xffffffff + 1 = 16
> + // CHECK: br {{.*}} !prof ![[IF:[0-9]+]]
> + if (I & 0xf) {
> + // 0xefffffff1 / 16 + 1 = 0xefffffff + 1 = 4026531840 => -268435456
> + } else {
> + // 0x0ffffffff / 16 + 1 = 0x0fffffff + 1 = 268435456 => 268435456
> + }
> +
> + // max = 0xffffffff * 5 = 0x4fffffffb
> + // scale = 0x4fffffffb / 0xffffffff + 1 = 6
> + // CHECK: ], !prof ![[SWITCH:[0-9]+]]
> + switch ((I & 0xf) / 5) {
> + case 0:
> + // 0x4fffffffb / 6 = 0xd5555554 + 1 = 3579139413 => -715827883
> + break;
> + case 1:
> + // 0x4fffffffb / 6 = 0xd5555554 + 1 = 3579139413 => -715827883
> + break;
> + case 2:
> + // 0x4fffffffb / 6 = 0xd5555554 + 1 = 3579139413 => -715827883
> + break;
> + default:
> + // 0x0ffffffff / 6 = 0x2aaaaaaa + 1 = 715827883 => 715827883
> + break;
> + }
> + }
> + return 0;
> +}
> +
> +// CHECK-DAG: ![[FOR]] = metadata !{metadata !"branch_weights", i32 -252645135, i32 1}
> +// CHECK-DAG: ![[IF]] = metadata !{metadata !"branch_weights", i32 -268435456, i32 268435456}
> +// CHECK-DAG: ![[SWITCH]] = metadata !{metadata !"branch_weights", i32 715827883, i32 -715827883, i32 -715827883, i32 -715827883}
> _______________________________________________
> cfe-commits mailing list
> cfe-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
More information about the cfe-commits
mailing list