[PATCH] D48383: [Dominators] Add the DomTreeUpdater class

Mon Jun 25 08:24:52 PDT 2018

brzycki added a comment.

In https://reviews.llvm.org/D48383#1140730, @kuhar wrote:

> In https://reviews.llvm.org/D48383#1140675, @dmgreen wrote:
>
> > Couple of quick, high level questions, in terms of how we expect the interface to be used. If there is a function that currently looks like this:
> >
> >   bool MergeBlockIntoPredecessor(BasicBlock *BB, DominatorTree *DT = nullptr,
> >                                  LoopInfo *LI = nullptr,
> >                                  MemoryDependenceResults *MemDep = nullptr,
> >                                  DeferredDominance *DDT = nullptr);
> >   
> >
> > .. and we'd like to make it optionally preserve PDT's, do you think it's best to have both an (optional) DT and a DTU? Just the DTU and get DT through it? Would the DTU remain optional or become required? (though possibly empty)
>
>
> The idea is to have it only take DTU (by reference (non-optional)), such that MergeBlockIntoPredecessor preserves whatever you pass it (possibly no tree).

+1. I'd like to see most, if not all, of the updates to PDT/DT to be done through the DTU interface `applyUpdate()` call. It simplifies method signatures and provides a consistent interface to analysis and updates.

>> What about something like
>> 
>>   BasicBlock *SplitBlockPredecessors(BasicBlock *BB, ArrayRef<BasicBlock *> Preds,
>>                                      const char *Suffix,
>>                                      DominatorTree *DT = nullptr,
>>                                      LoopInfo *LI = nullptr,
>>                                      bool PreserveLCSSA = false);
>> 
>> Which I think calls DT->splitBlock() internally. Do you think updating this just to take a DTU instead of a DT, and updating all the call sites is the best way to go?
> 
> In general. the preferred way would be take DTU as an argument and try to use the incremental api.
>  There are some cases where it seems like you can do some smarter things with more low-level operations, e.g. changing the function entry block, splitting predecessor, but from what I know there are only a couple of them. In such cases we can expose them as dedicated functions in DTU.

In this case someone will have to do the `flush()` call during the Lazy Strategy. Either the call of `SplitBlockPredecessors()` will have to do:

  auto DT = DTU->getDT(); // flush, deleteBBs called
  auto NewBB = SplitBlockPredecessors(BB, Preds, Suffix, DT);

or

  auto NewBB = SplitBlockPredecessors(BB, Preds, Suffix, DTU);
  ...
  // Inside SplitBlockPredecessors()
  auto DT = DTU->getDT();
  DT->splitblock();
  ...

The latter example has three benefits in my opinion:

1. the details of DT/DTU are hidden from the caller
2. SplitBlockPredecessors now preserves (and u[dates) PDT for free
3. There might also be a way to update the DT/PDT without the `flush()` call which can be investigated without altering all the callers of `SplitBlockPredecessors()`

================
Comment at: include/llvm/IR/DomTreeUpdater.h:142
+  /// It must only be called when it has a DomTree.
+  DominatorTree &flushDomTree();
+
----------------
NutshellySima wrote:
> NutshellySima wrote:
> > dmgreen wrote:
> > > brzycki wrote:
> > > > I'd actually prefer if flushDomTree() and flushPostDomTree(), and flushAll() were private members. Users shouldn't have to think about flushing state. Instead, have all delections and flushes happen when they request access to a DT or PDT reference such as:
> > > > 
> > > > ```
> > > > auto DTU = DomTreeUpdater(DT, PDT, Lazy);
> > > > ...
> > > > DTU->applyUpdates({Foo, Bar, Insert});
> > > > DTU->applyUpdates({Bar, Baz, Delete});
> > > > ...
> > > > auto DT = DTU->getDT(); // flush() and delete BBs happen here automatically.
> > > > ```
> > > What do you think of having DTU automatically flush remaining updates in it's deconstructor. So that things like these (to steal your examples) work:
> > >   void F(..)
> > >   {
> > >     auto DTU = DomTreeUpdater(DT, PDT, Lazy);
> > >     ...
> > >     DTU->applyUpdates({Foo, Bar, Insert});
> > >     DTU->applyUpdates({Bar, Baz, Delete});
> > >   }
> > > 
> > > or
> > > 
> > >   {
> > >     auto DTU = DomTreeUpdater(DT, PDT, Lazy);
> > >     SomeFunctionThatUpdatesTrees(..., DTU, ...);
> > >   }
> > I think flushAll() can't be private members:
> > 
> > ```
> > auto DTU = DomTreeUpdater(DT, Lazy);
> > ... // Make BB0 forward/backward-unreachable.
> > DTU->applyUpdates(SomeUpdates);
> > auto DT = DTU->getDT();
> > ...
> > // Now delete BB0
> > DTU->deleteBB(BB0);
> > // Now there is no pendingDTUpdates and can only use flushAll() to get BB0 removed from its parent and deleted.
> > ```
> More information on the above comments:
> I remember some code will break if the internal logic of deleteBB calls removeFromParent() and deletes BB immediately when there isn't any pending DT/PDT updates.
> So, I think it is needed to leave a flushAll() function to help user flush BB awaiting deletion.
That would be a good idea to force flushes (and deleteBB) on deconstruction iff any are pending.

================
Comment at: include/llvm/IR/DomTreeUpdater.h:142
+  /// It must only be called when it has a DomTree.
+  DominatorTree &flushDomTree();
+
----------------
brzycki wrote:
> NutshellySima wrote:
> > NutshellySima wrote:
> > > dmgreen wrote:
> > > > brzycki wrote:
> > > > > I'd actually prefer if flushDomTree() and flushPostDomTree(), and flushAll() were private members. Users shouldn't have to think about flushing state. Instead, have all delections and flushes happen when they request access to a DT or PDT reference such as:
> > > > > 
> > > > > ```
> > > > > auto DTU = DomTreeUpdater(DT, PDT, Lazy);
> > > > > ...
> > > > > DTU->applyUpdates({Foo, Bar, Insert});
> > > > > DTU->applyUpdates({Bar, Baz, Delete});
> > > > > ...
> > > > > auto DT = DTU->getDT(); // flush() and delete BBs happen here automatically.
> > > > > ```
> > > > What do you think of having DTU automatically flush remaining updates in it's deconstructor. So that things like these (to steal your examples) work:
> > > >   void F(..)
> > > >   {
> > > >     auto DTU = DomTreeUpdater(DT, PDT, Lazy);
> > > >     ...
> > > >     DTU->applyUpdates({Foo, Bar, Insert});
> > > >     DTU->applyUpdates({Bar, Baz, Delete});
> > > >   }
> > > > 
> > > > or
> > > > 
> > > >   {
> > > >     auto DTU = DomTreeUpdater(DT, PDT, Lazy);
> > > >     SomeFunctionThatUpdatesTrees(..., DTU, ...);
> > > >   }
> > > I think flushAll() can't be private members:
> > > 
> > > ```
> > > auto DTU = DomTreeUpdater(DT, Lazy);
> > > ... // Make BB0 forward/backward-unreachable.
> > > DTU->applyUpdates(SomeUpdates);
> > > auto DT = DTU->getDT();
> > > ...
> > > // Now delete BB0
> > > DTU->deleteBB(BB0);
> > > // Now there is no pendingDTUpdates and can only use flushAll() to get BB0 removed from its parent and deleted.
> > > ```
> > More information on the above comments:
> > I remember some code will break if the internal logic of deleteBB calls removeFromParent() and deletes BB immediately when there isn't any pending DT/PDT updates.
> > So, I think it is needed to leave a flushAll() function to help user flush BB awaiting deletion.
> That would be a good idea to force flushes (and deleteBB) on deconstruction iff any are pending.
You make a good point here, I think the big flush needs to be public. Can't it just be called `flush()` or `sync()` instead of `flushAll()`?  I'm still hopeful the need for this call can be minimized by the DT/PDT getter method implicitly flushing and more call-sites only use DTU.

https://reviews.llvm.org/D48383