[PATCH] Allow 0-weight branches in BranchProbabilityInfo.
Duncan P. N. Exon Smith
dexonsmith at apple.com
Wed May 6 10:09:21 PDT 2015
> On 2015 May 6, at 12:42, Diego Novillo <dnovillo at google.com> wrote:
>
> On Wed, May 6, 2015 at 12:41 PM, Hans Wennborg <hans at chromium.org> wrote:
>> ================
>> Comment at: lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp:8024
>> @@ -8013,1 +8023,3 @@
>> + Weight = std::max(
>> + 1u, BPI->getEdgeWeight(SI.getParent(), I.getSuccessorIndex()));
>> assert(Weight <= UINT32_MAX / SI.getNumSuccessors());
>> ----------------
>> This looks good. I can address this TODO later.
>>
>> I figured this would be the only change necessary for the switch lowering code. Are the changes to getEdgeWeight() and lowerWorkItem() also necessary?
>
> I don't know. I took the machine gun approach and changed all the
> places that called into BPI. I can remove the ones that don't cause
> the assertion in the tests, if you want.
>
>
> Diego.
Same comment as Hans, but I don't think the change is the right one
for switch lowering:
> Index: lib/Analysis/BranchProbabilityInfo.cpp
> ===================================================================
> --- lib/Analysis/BranchProbabilityInfo.cpp
> +++ lib/Analysis/BranchProbabilityInfo.cpp
> @@ -201,8 +201,7 @@
> mdconst::dyn_extract<ConstantInt>(WeightsNode->getOperand(i));
> if (!Weight)
> return false;
> - Weights.push_back(
> - std::max<uint32_t>(1, Weight->getLimitedValue(WeightLimit)));
> + Weights.push_back(Weight->getLimitedValue(WeightLimit));
> }
> assert(Weights.size() == TI->getNumSuccessors() && "Checked above");
> for (unsigned i = 0, e = TI->getNumSuccessors(); i != e; ++i)
> @@ -553,7 +552,7 @@
> uint32_t PrevSum = Sum;
>
> Sum += Weight;
> - assert(Sum > PrevSum); (void) PrevSum;
> + assert(Sum >= PrevSum); (void) PrevSum;
> }
>
> return Sum;
> @@ -616,14 +615,17 @@
> uint32_t BranchProbabilityInfo::
> getEdgeWeight(const BasicBlock *Src, const BasicBlock *Dst) const {
> uint32_t Weight = 0;
> + bool FoundWeight = false;
> DenseMap<Edge, uint32_t>::const_iterator MapI;
> for (succ_const_iterator I = succ_begin(Src), E = succ_end(Src); I != E; ++I)
> if (*I == Dst) {
> MapI = Weights.find(std::make_pair(Src, I.getSuccessorIndex()));
> - if (MapI != Weights.end())
> + if (MapI != Weights.end()) {
> + FoundWeight = true;
> Weight += MapI->second;
> + }
> }
> - return (Weight == 0) ? DEFAULT_WEIGHT : Weight;
> + return (!FoundWeight) ? DEFAULT_WEIGHT : Weight;
> }
>
> /// Set the edge weight for a given edge specified by PredBlock and an index
> Index: lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
> ===================================================================
> --- lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
> +++ lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
> @@ -1319,7 +1319,10 @@
> return 0;
> const BasicBlock *SrcBB = Src->getBasicBlock();
> const BasicBlock *DstBB = Dst->getBasicBlock();
> - return BPI->getEdgeWeight(SrcBB, DstBB);
> + // TODO - BPI used to guarantee non-zero weights, but this produces
> + // information loss (see PR 22718). Since we can't handle zero weights here,
> + // use the same flooring mechanism previously used by BPI.
Why not?
> + return std::max(1u, BPI->getEdgeWeight(SrcBB, DstBB));
> }
>
> void SelectionDAGBuilder::
> @@ -7774,7 +7777,12 @@
> addSuccessorWithWeight(
> SwitchMBB, DefaultMBB,
> // The default destination is the first successor in IR.
> - BPI ? BPI->getEdgeWeight(SwitchMBB->getBasicBlock(), (unsigned)0)
> + // TODO - BPI used to guarantee non-zero weights, but this produces
> + // information loss (see PR 22718). Since we can't handle zero
> + // weights here, use the same flooring mechanism previously used by
> + // BPI.
> + BPI ? std::max(1u, BPI->getEdgeWeight(SwitchMBB->getBasicBlock(),
> + (unsigned)0))
> : 0);
Same here: why not? Shouldn't we fix that first?
>
> // Insert the true branch.
> @@ -8009,7 +8017,11 @@
> const ConstantInt *CaseVal = I.getCaseValue();
> uint32_t Weight = 1;
> if (BPI) {
> - Weight = BPI->getEdgeWeight(SI.getParent(), I.getSuccessorIndex());
> + // TODO - BPI used to guarantee non-zero weights, but this produces
> + // information loss (see PR 22718). Since we can't handle zero weights
> + // here, use the same flooring mechanism previously used by BPI.
> + Weight = std::max(
> + 1u, BPI->getEdgeWeight(SI.getParent(), I.getSuccessorIndex()));
This looks like `visitSwitch()`. I don't see why we can't handle a
weight of 0 here.
It would be easy to change `SelectionDAGBuilder::splitWorkItem` to be
resilient to weights of 0 instead:
diff --git a/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp b/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
index 5b6efe0..9b147cf 100644
--- a/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
+++ b/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
@@ -7990,23 +7990,33 @@ void SelectionDAGBuilder::splitWorkItem(SwitchWorkList &WorkList,
// Mehlhorn "Nearly Optimal Binary Search Trees" (1975).
CaseClusterIt LastLeft = W.FirstCluster;
CaseClusterIt FirstRight = W.LastCluster;
- uint32_t LeftWeight = LastLeft->Weight;
- uint32_t RightWeight = FirstRight->Weight;
+
+ auto getWorkingWeight = [](const CaseCluster &Cluster) -> uint64_t {
+ // Zero-weight nodes would naturally cause skewed trees since they don't
+ // affect LeftWeight or RightWeight. Shift weights up 32 bits, and give
+ // zero-weights a value of 1.
+ uint64_t W = Cluster.Weight;
+ if (W)
+ return W << 32;
+ return 1;
+ };
+ uint64_t WorkingLeftWeight = getWorkingWeight(*LastLeft);
+ uint64_t WorkingRightWeight = getWorkingWeight(*FirstRight);
// Move LastLeft and FirstRight towards each other from opposite directions to
// find a partitioning of the clusters which balances the weight on both
// sides.
while (LastLeft + 1 < FirstRight) {
- // Zero-weight nodes would cause skewed trees since they don't affect
- // LeftWeight or RightWeight.
- assert(LastLeft->Weight != 0);
- assert(FirstRight->Weight != 0);
-
- if (LeftWeight < RightWeight)
- LeftWeight += (++LastLeft)->Weight;
+ if (WorkingLeftWeight < WorkingRightWeight)
+ WorkingLeftWeight += getWorkingWeight(*++LastLeft);
else
- RightWeight += (--FirstRight)->Weight;
+ WorkingRightWeight += getWorkingWeight(*--FirstRight);
}
+
+ // Scale back down to 32-bit numbers.
+ uint32_t LeftWeight = WorkingLeftWeight >> 32;
+ uint32_t RightWeight = WorkingRightWeight >> 32;
+
assert(LastLeft + 1 == FirstRight);
assert(LastLeft >= W.FirstCluster);
assert(FirstRight <= W.LastCluster);
> assert(Weight <= UINT32_MAX / SI.getNumSuccessors());
Hmm. I don't think BPI should guarantee this anymore. This is
unnecessary for keeping the sum under `UINT32_MAX`.
> }
> Clusters.push_back(CaseCluster::range(CaseVal, CaseVal, Succ, Weight));