[PATCH] D48967: [Dominators] Convert existing passes and utils to use the DomTreeUpdater class

Mon Jul 9 10:07:16 PDT 2018

NutshellySima added inline comments.

================
Comment at: lib/Transforms/Scalar/JumpThreading.cpp:420
   LoopHeaders.clear();
-  DDT->flush();
+  DTU->getDomTree();
   LVI->enableDT();
----------------
brzycki wrote:
> NutshellySima wrote:
> > brzycki wrote:
> > > It's not obvious here the reasoning behind this statement. I recommend either changing this to `DTU->flush()` or adding a comment stating the call will force all updates to the internal DT/PDT trees.
> > I think `DTU->flush()` should only be called when people need either
> > 1. Flush all deleteBB awaiting deletion.
> > 2. Do queries on both DT and PDT.
> > 
> > In the future, if DTU holds both DT and PDT in JumpThreading, call `DTU->getDomTree` which only flushes DT can pend updates on PDT as lazy as possible.
> I'm concerned the complexity of updates will become unmanageable if you decide to take this route.  Some questions I have:
> * How do you preserve the list of updates for one vs the other? Two separate queues?
> * How do you decide to delete unnecessary updates from one and not the other?
> * How do you manage the list of delted BBs? Are they only really deleted when both DT and PDT are flushed?
> * How do you make sure you don't actually delete a BB early?
> * How do you handle `Eager` vs `Lazy` updates?
> * How do you handle `*Relaxed()` updates?
> 
> The longer you avoid flushing the further your updates become from the real CFG's shape. Debugging an assert where DT and PDT update at different times means the developer has to also carefully keep the state of DTU in mind instead of just the IR and the trees. I'm not sure if the possible saved overhead of updates to one or the other tree is worth all that added complexity.
> How do you preserve the list of updates for one vs the other? Two separate queues?
There is only one queue and two index to record the updates that have been applied and not applied yet.

| Index | 0 |  1 |  2 | 3
| PendUpdates | o | o | o | o 
|DTUpdateIndex | | | ↑ | |
| PDTUpdateIndex |↑ | | | |

> How do you decide to delete unnecessary updates from one and not the other?
If an update is applied by all available trees, it will be purged.

> How do you manage the list of deleted BBs? Are they only really deleted when both DT and PDT are flushed?
Yes, more specifically `DTUpdateIndex == PDTUpdateIndex == PendUpdates.size()`. It's impossible to judge whether a specific DelBB is feasible to be deleted at a specific time point because some callers of delBB just
```
BasicBlock *BB0 = ...;
DTU.delBB (BB0);
// Then do some actions on BB0
... ...
DTU.flush(); // The caller expects BB0 be deleted now.
```
> How do you make sure you don't actually delete a BB early?
Mentioned above.
> How do you handle `Eager` vs `Lazy` updates?
`applyUpdates`: 
`Lazy/Eager+ForceRemoveDuplicates` - Works the same as DDT, more specifically
1. Deduplicate, which is the main functionality of DDT. 
2. Inspect whether an edge is presented in CFG and decide whether to discard it. It is a must because the update sequence is unbalanced.
`Eager` - Works the same as raw DT/PDT.
`insertEdge/deleteEdge`:
`Eager`: Works the same as raw DT/PDT and adds inspection whether the update is valid.
`Lazy`: then submits into the list of updates rather than apply it.
`recalculate`:
`Lazy`: `flush()` + raw `recalculate()`
`Eager`: raw `recalculate()`
> How do you handle `*Relaxed()` updates?
It behaves the same as the ones in DDT. It discards invalid updates quietly.

> The longer you avoid flushing the further your updates become from the real CFG's shape. Debugging an assert where DT and PDT update at different times means the developer has to also carefully keep the state of DTU in mind instead of just the IR and the trees. I'm not sure if the possible saved overhead of updates to one or the other tree is worth all that added complexity.

I don't think so. With current DDT, let me show an example
```
// Changing the state of CFG begins. 
... // Modify CFG
MergeBasicBlockIntoOnlyPred (..., &DDT); // Some error happens !
// (You cannot do verification after MergeBasicBlockIntoOnlyPred because the CFG and DT is unsynced)
... // Modify DomTree
... // Modify CFG
// Changing the state of CFG ends.
...
DDT.flush().verify(); // Asserts !
```
Now with DTU,
```
// Changing the state of CFG begins.
... // Modify CFG
MergeBasicBlockIntoOnlyPred (..., &DTU); // Some error happens !
//  (You cannot do verification after MergeBasicBlockIntoOnlyPred because the CFG and DT is unsynced)
... // Modify DomTree
... // Modify CFG
// Changing the state of CFG ends.
...
DTU.getDomTree().verify(); // Asserts !
```
I can't see the difference. They'll assert at the same location and all far away from where the bug happens.

https://reviews.llvm.org/D48967