[PATCH] D13363: [DeadStoreElimination] Add support for non-local DSE
Ivan Baev via llvm-commits
llvm-commits at lists.llvm.org
Wed Feb 3 19:22:25 PST 2016
ivanbaev added a comment.
Hi Mitch,
Thanks for working on this.
In http://reviews.llvm.org/D13363#334021, @mbodart wrote:
> I have a resolution for the issue causing bot timeouts,
> as well as one unrelated fix for irreducible control flow.
> Fortunately both fixes are trivial. Details below.
>
> Chad or Ivan, how do you want to proceed?
> Can you apply these fixes and re-land the change?
Feel free to proceed as Chad suggested.
> I used Rafael's llvm-ar.ll reproducer at
> http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151207/318782.html
> to chase the bot timeouts.
>
> The issue is in findSafePreds, which was getting tripped up by a switch
> statement, and inadvertently adding the same block to the worklist
> multiple times. The predecessor iterator needs to avoid processing
> a given predecessor more than once. Keeping track with a "PredsVisited"
> set resolves that, and there is no longer any noticeable compile time
> difference w/wo nonlocal DSE for llvm-ar.ll.
Is it something like:
SmallPtrSet<BasicBlock *, 8> Visited;
for (auto I = pred_begin(BB), E = pred_end(BB); I != E; ++I) {
if (!Visited.insert(*I).second)
continue;
> Wrt irreducibility, I was able to construct a test case for which the
> CFG search goes into a cycle and revisits the original safe block
> that instigated the nonlocal DSE search, SI->getParent() in
> DSE::handleNonLocalDependency(). My earlier suggestion for the search
> routine had an explicit check for !SafeBlocks.count(P) to avoid this
> situation, but that was elided in favor of the dominator check
> DT->dominates(BB, Pred). With irreducible control flow we can have
> a case where Pred is the original safe block, but BB does not dominate it.
>
> One fix here is to pass in that original block to findSafePreds,
> and replace the dominator check with DT->dominates(OrigBB, Pred).
> While fairly subtle, this prevents processing any block more than once,
> and is still sufficient to avoid problems with the "non loop aware"
> dependence analysis.
Do you propose replacing DT->dominates(BB, Pred) in
if (!DT->isReachableFromEntry(Pred) || DT->dominates(BB, Pred))
with DT->dominates(OrigBB, Pred), or adding the latter as a third check?
It would be great to have this irreducible CFG test added to the existing tests for DSE.
> One other nit: The SafeBlocks set is unordered, and only needed for
> membership queries. Should it be declared as a SmallSet instead of
> a SmallSetVector?
That's fine. We actually need a SmallSetVector with reverse insertion order iteration characteristics.
> Finally, I collected some stats when compiling llvm-ar.ll.
>
> There are 23138 calls to handleNonLocalDependency which
> reach the worklist processing, with 84 dead instructions found.
> 94% of the time we process 5 or fewer predecessor blocks.
> In the worst case we looked at 266 predecessors.
>
> Blocks Frequency
> 0 11553 50%
> 1 5047 22%
> 2 3858 17%
> 3 778 3%
> 4 285
> 5 221
> 6 148
> 7 112
> 8 193
> 9 57
> 10 53
> 11-50 410
> 51-100 36
> 100-266 7
>
> In one instance we found 5 dead instructions by searching 22 blocks.
> I didn't dig into whether those could have been found in fewer blocks.
> Searching more than 22 blocks was never fruitful for llvm-ar.ll.
>
> Though we would of course want more samples to make any decisions,
> limiting the number of worklist blocks processed is another
> candidate for a compile time threshold, should one be needed.
> OTOH, it may only matter in pathalogical cases.
Interesting data; it may vary from a source language to a source language. I agree that we need more samples to justify adding such heuristic. For now, I propose:
static const unsigned MaxNonLocalAttempts = 10;
Ivan
Repository:
rL LLVM
http://reviews.llvm.org/D13363
More information about the llvm-commits
mailing list