[PATCH] D57881: [DTU] Refine the document of mutation APIs [NFC] (PR40528)

Mon Feb 11 09:00:01 PST 2019

brzycki added a comment.

Hello @mkazantsev.

> That looks *highly* fishy. Between which two points [of `insertEdge()`] we should have exactly one update?

If you are using the `Eager` strategy these updates will be immediately applied to the `DT` and `PDT` internal to the `DTU` class. As the comment states there are internal checks of the CFG to determine if the insertion is valid. Because `DT`, `PDT`, and CFG are three distinct and separate structures we have to treat the CFG as the canonical shape of the IR. In `Lazy` strategy mode the update is simply queued until the next flush event.

> For example, if we use eager strategy and we need to delete 10 edges and insert 10 other edges, is that guaranteed to work if we consecutively make 1 CFG update followed by 1 DT update, and do so for every edge? Can we interpret it as making exactly one update 20 times over?

Yes, **if and only if** the state of the CFG is considered valid after each single deletion and insertion event (see number 3 below). For most cases it's recommended to use the `applyUpdates()` method in this case for several reasons:

1. There may be updates that cancel each other out `{Insert, A, B} ... {Delete, A, B}`. It is much more efficient to detect these before altering the `DT` and `PDT` as a `no-op`.
2. It's often the case at the call-site the code iterates over dozens of edges and needs to alter several of them in the CFG. Because the `DTU` class requires a stable CFG state to check valid insertions and deletions it's often better to send the updates to `DTU` after all the changes are done.
3. There are cases when it's impossible to make the CFG correct for a single edge change. There are often cases where 2-4 edges have to be altered before the CFG is considered valid. In this case you'll actually cause a `DTU` assert if you attempt to perform `insertEdge()` for every edge because it will fail when attempting to check the insertion in the CFG.
4. Single calls to `insertEdge()` and `deleteEdge()` incur function call overhead and multiple checks that are aggregated in the case of a single call to `applyUpdates()`.

There are valid reasons to use `insertEdge()` and `deleteEdge()`. The primary reason is the caller needs to only make exactly one change to `DTU`. The call to `applyUpdates()` has a bit of extra overhead to sanitize the update list and only makes sense if there are 2 or more updates.

> Is that exactly one update between two flushes for lazy strategy?

I'm not sure how to answer this question. The idea of exactly one update for `Lazy` strategy is much more difficult to determine. The update may be a `no-op`. It may also be enqueued only to be removed by a later update before a flush event as a new `no-op` pair. The update might be invalid but `DTU` was called through one of the `*Relaxed()` methods.

I will say that if you do the following:

  DTU->flush();
  // Alter CFG to add a new edge A -> B
  DTU->insertEdge(A, B);
  DTU->flush();

Will perform exactly one update at the `DTU` level. This may cause two sub-updates if both `DT` and `PDT` are not `nullPtr`. Each of these sub-updates may incur dozens of edge changes in their respective `DominatorTree` graphs.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D57881/new/

https://reviews.llvm.org/D57881