[PATCH] D53245: Teach the DominatorTree fallback to recalculation when applying updates to speedup JT (PR37929)

Mon Oct 15 07:07:46 PDT 2018

kuhar added a comment.

In https://reviews.llvm.org/D53245#1265080, @NutshellySima wrote:

> In https://reviews.llvm.org/D53245#1264864, @dmgreen wrote:
>
> > Interesting. Nice improvements. What about small trees? It would seem that any tree less that 75 nodes would always be recalculated. Do the timings you ran include things to show that this is better? Or was that just looking at larger trees at the time?
>
>
> I haven't conducted dedicated tests on graphs with the number of vertices less than 75, but among all the domtrees recalculated in the two `clang-*.bc` I mentioned above, there are approximately 63% of them has an `n` smaller than 75.

I'd be much more comfortable if this change didn't always recalculate for graphs smaller than 75. I think there is a lot of value in making sure the incremental updater is behaving correctly on small examples (e.g., existing unit tests, most regression tests, etc.). If for some reason it stops working, it will be much easier to detect and fix it on these small examples.

> In fact, the time used by directly applying updates to the domtree also depends on whether the graph is sparse or dense.  It is hard to generate accurate heuristic to estimate when to recalculate the domtree or not. Because edge insertions have a complexity of `O(m*min{n,k}+nk)`, edge deletions have a complexity of `O(m*k)` and recalculation has a complexity of `O(n^2)`. (Let `n` denote number of vertices of the graph/`m` be the number of edges after updates/`k` be the number of updates). I think if we want to fine-tune this heuristic, we need to collect a lot of statistics and mathematically fit the function (n,m,k)->bool (whether to recalculate or not).

Since we cannot measure running time directly in any realistic scenario, I think that the best way to approach this would be to count the number of graph nodes visited during DFS walks and see how many of them we get with different heuristics.
I think that the bugs already established that we need to have some fallback heuristic. I'd suggest being careful here and trying at least a couple of different policies on different projects (maybe clang, sqlite, the lnt test suite). There are a couple of properties I propose for the heuristic:

1. Don't always recalculate small trees (say, n < 100)
2. Recalculate when k > n
3. Recalculate in the case of the performance bugs we saw
4. Visit fewer nodes than the current updater for Clang + SQLite + Oggenc + ...

> @kuhar , do you think we need to use another heuristic involving `m`? Though I think maybe using only `n` is sufficient? :)

Intuitively, I think that the heuristic can be some linear function: a * n + b * k > threshold, where a and b are coefficients and the threshold is a constant. I don't think it's possible to calculate m efficiently, is it?

https://reviews.llvm.org/D53245