[PATCH] D57881: [DTU] Refine the document of mutation APIs [NFC] (PR40528)

Tue Feb 12 21:41:33 PST 2019

mkazantsev added inline comments.

================
Comment at: llvm/include/llvm/Analysis/DomTreeUpdater.h:137
   /// CFG in DEBUG mode.
+  /// CAUTION! It is only legal to use this method when you have exactly one
+  /// insertion (and no deletions). This function has to be called *after*
----------------
NutshellySima wrote:
> mkazantsev wrote:
> > NutshellySima wrote:
> > > mkazantsev wrote:
> > > > That looks *highly* fishy. Between which two points we should have exactly one update?
> > > > 
> > > > For example, if we use eager strategy and we need to delete 10 edges and insert 10 other edges, is that guaranteed to work if we consecutively make 1 CFG update followed by 1 DT update, and do so for every edge? Can we interpret it as making exactly one update 20 times over?
> > > > 
> > > > Is that exactly one update between two flushes for lazy strategy?
> > > When using DTU Eager UpdateStrategy or using DT/PDT directly, updating a DominatorTree doesn't follow associative law of conjunction because the internal updating logic of DomTree will do a graph search on the corresponding CFG each time you call the mutation APIs.
> > > For example, 
> > > ```
> > > // Don't make any change to CFG
> > > // Considering Eager DomTree updates is idempotent
> > > DT.insertEdge(Foo, Bar); // Assert here if there is no Foo->Bar in the original CFG
> > > DT.deleteEdge(Foo, Bar); // Assert here if there is Foo->Bar in the original CFG
> > > ```
> > > Considering `insertEdge/deleteEdge` can bring confusion to devs, I'll consider remove those 2 APIs but
> > > ```
> > > // Don't make any change to CFG
> > > // Considering Eager DomTree updates is idempotent
> > > DT.applyUpdates({Insert,Foo, Bar}); // Assert here if there is no Foo->Bar in the original CFG
> > > DT.applyUpdates({Delete,Foo, Bar}); // Assert here if there is Foo->Bar in the original CFG
> > > ```
> > > is also not OK overall.
> > I mean the following scenario:
> >   1. Delete A->B in control flow
> >   2.  DTU.deleteEdge(A, B)
> >   3. Delete B->C in control flow
> >   4. DTU.deleteEdge(B, C)
> >   ...
> > 
> > I can have two different interpretations of that:
> > a) It is not supposed to work under any strategy because I have more than 1 deletion.
> > b) It is only supposed to work under Eager strategy because at every point I make exactly one CFG update followed by DT update, and it is not supposed to work under Lazy strategy because I have more than 1 deletion.
> > 
> > From this comment, I cannot figure out which interpretation is correct (if any).
> > 
> DTU can work in the scenario you mentioned.
> The precondition you call any mutation API (insertEdge/deleteEdge/applyUpdates) is that **the current CFG is the same as (the CFG which DTU is in sync+updates submitted).**
> So,
> ``` 
> // CFG state a), DTU is now in sync with CFG state a)
> 1. Delete A->B in control flow // CFG state b)
> 2. DTU.deleteEdge(A, B) // `CFG state b)` matches `CFG state a)-Edge(A, B)`, so it's legal. DTU is now in sync with CFG state b)
> 3. Delete B->C in control flow // CFG state c)
> 4. DTU.deleteEdge(B, C) // `CFG state c)` matches `CFG state b)-Edge(B, C)`, so it's legal. DTU is now in sync with CFG state c).
> ```
> 
> The above scenario works under **both Eager and Lazy modes**.
> 
> You can also submit updates under *only* Lazy mode like:
> 1. Delete A->B in control flow
> 2. Delete B->C in control flow
> 3. DTU.deleteEdge(A, B)
> 4. DTU.deleteEdge(B, C)
> But I don't think it's valuable to document this difference between Eager and Lazy mode because it will only  cause confusion to others. I believe the aim of the document of DTU is to make a uniform rule to use both Eager and Lazy modes to encourage us to write code that is fully compatible under both modes, i.e. hiding the difference between them.
This precondition you've mentioned should be met at *different* points under Lazy and Eager strategy. We DO NOT need this precondition after each insert/deleteEdge under Lazy strategy, we only need it when we `flush`. We DO need this precondition after each insert/deleteEdge under Eager strategy.

If I understand you correctly, there are actually two rules:
1. Under the Eager strategy, `flush` is done after every insert/deleteEdge. Under the Lazy strategy, it should be called explicitly.
2. By the moment when we make `flush`, the current CFG is the same as (the CFG which DTU is in sync+updates submitted).

This difference is important, and lack of it in documentation was the motivation for PR40528. I think it should be documented.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D57881/new/

https://reviews.llvm.org/D57881