[PATCH] D10979: Update the branch weight metadata in JumpThreading pass.
Duncan P. N. Exon Smith
dexonsmith at apple.com
Tue Jul 7 16:53:26 PDT 2015
> On 2015-Jul-07, at 13:24, Xinliang David Li <davidxl at google.com> wrote:
>
> On Tue, Jul 7, 2015 at 10:39 AM, Duncan P. N. Exon Smith
> <dexonsmith at apple.com> wrote:
>>
>>> On 2015-Jul-06, at 17:22, Cong Hou <congh at google.com> wrote:
>>>
>>>
>>> congh added reviewers: chandlerc, davidxl.
>>> congh added a subscriber: llvm-commits.
>>>
>>> Currently in JumpThreading pass, the branch weight metadata is not updated after CFG modification. Consider the jump threading on PredBB, BB, and SuccBB. After jump threading, the weight on BB->SuccBB should be adjusted as some of it is contributed by the edge PredBB->BB, which doesn't exist anymore. This patch tries to update the edge weight in metadata on BB->SuccBB by scaling it by 1 - Freq(PredBB->BB) / Freq(BB->SuccBB). Two more analyses (BlockFrequencyInfo and BranchProbabilityInfo) are needed then.
>>
>> Thanks for working on this!
>>
>> Generally you don't need these analyses to keep branch weights
>> up-to-date. The design premise is that you don't need global
>> information for local updates.
>>
>
> This is true in general, but probably not the case for jump-threading.
> When a new thread is formed (from NewBB to SuccBB), the profile
> update delta to the original edge BB->SuccBB comes from a different
> edge (Pred->NewBB which is inherited from Pred->BB). Local update by
> only looking at BB seems impossible.
Ah, I see. Yes, you need a global view to know predecessor
probabilities.
>
>> You should be able to calculate the new !prof attachment based on the
>> old ones, without running BFI. (I'm skeptical of even running BPI -- if
>> there's no !prof attachment on the old block, then you have no real
>> information; what's the benefit in generating a new !prof attachment
>> based on heuristics? The new CFG will come with its own new
>> heuristics.)
>>
>
> I believe Cong's fix is targeting PGO.
Cong, I have a few concerns with the current approach.
Firstly, this requires BFI even when there is no profile data (as David
also noted in one of the Phabricator emails). I don't think it's
reasonable to require all users of jump threading to run an analysis
that will never be used. One option would be to `addOptional<>`, but
I'm not sure that'll get you what you want.
To prevent everyone paying for profile data, we should do the following:
if the basic block in question has branch weights, then (and only then)
retrieve the BFI analysis so they can be updated. I think you'll either
have to port the analysis retrieval and caching logic from the new
`PassManager` over to `LegacyPassManager`, or wait for it to be used in
tree.
(On a related point, I doubt users of PGO + jump threading with partial
profiles (e.g., JITs) care enough about fidelity to justify running BFI
for this edge. IMO, this update should be configurable.)
Secondly, the current code requires BFI and then immediately invalidates
it, but it should be straightforward to incrementally update (preserve)
BFI here.
Moreover, I think your patch is incorrect as written. The structure of
the algorithm is basically:
do {
Changed = false;
for (BasicBlock &BB : F)
while (ProcessBlock(BB))
Changed = true;
} while (Changed);
As soon as `ProcessBlock()` does something, the next call will have
stale BFI, won't it?
Thirdly (a performance note), I don't think it makes sense to recompute
BFI in `ProcessBlock()` (in the inner loop). Instead:
do {
Changed = false;
for (BasicBlock &BB : F) {
bool BlockChanged = false;
while (ProcessBlock(BB))
BlockChanged = true;
if (BlockChanged) {
Changed = true;
updateBranchWeights(BB);
}
}
} while (Changed);
This waits to update branch weights until all the relevant predecessors
have been threaded through.
More information about the llvm-commits
mailing list