[PATCH] D54730: [DomTree] Fix order of domtree updates in MergeBlockIntoPredecessor.

Chijun Sima via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Nov 20 10:38:59 PST 2018


NutshellySima accepted this revision.
NutshellySima added a comment.
This revision is now accepted and ready to land.

In https://reviews.llvm.org/D54730#1303575, @kuhar wrote:

> We experimented with different ordering of updates before, including scheduling deletions before insertions, but didn't discover any promising strategies. Because of that, updates are performed in the exact order they were scheduled (except optimizing out trivially redundant ones).
>
> The preferred API to use is currently DomTreeUpdater, but I don't think it would affect performance here.
>
> Any thoughts @NutshellySima?


I saw similar results recently when I tried to make `UpdateAnalysisInformation` in `BasicBlockUtils` use the incremental updater: scheduling edge insertions before deletions can sometimes be faster. In the case of `UpdateAnalysisInformation`, the numbers of edge insertions and deletions are nearly the same.

If I schedule edge deletions before insertions, I get

  ===-------------------------------------------------------------------------===
                                DomTree Calculation
  ===-------------------------------------------------------------------------===
    Total Execution Time: 144.1258 seconds (144.0620 wall clock)
  
     ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
    90.6385 ( 63.9%)   1.4306 ( 63.0%)  92.0691 ( 63.9%)  92.0375 ( 63.9%)  delete-reachable -- DomTree
    45.4411 ( 32.0%)   0.6180 ( 27.2%)  46.0591 ( 32.0%)  46.0432 ( 32.0%)  delete-unreachable -- DomTree
     3.8668 (  2.7%)   0.1652 (  7.3%)   4.0320 (  2.8%)   4.0172 (  2.8%)  insert-unreachable -- DomTree
     1.9094 (  1.3%)   0.0562 (  2.5%)   1.9657 (  1.4%)   1.9640 (  1.4%)  insert-reachable -- DomTree
    141.8558 (100.0%)   2.2700 (100.0%)  144.1258 (100.0%)  144.0620 (100.0%)  Total

But if I schedule edge insertions before deletions, I get

  ===-------------------------------------------------------------------------===
                                DomTree Calculation
  ===-------------------------------------------------------------------------===
    Total Execution Time: 95.0652 seconds (95.0225 wall clock)
  
     ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
    91.7329 ( 98.4%)   1.7553 ( 95.1%)  93.4882 ( 98.3%)  93.4364 ( 98.3%)  delete-reachable -- DomTree
     1.2455 (  1.3%)   0.0733 (  4.0%)   1.3188 (  1.4%)   1.3275 (  1.4%)  insert-unreachable -- DomTree
     0.1300 (  0.1%)   0.0074 (  0.4%)   0.1374 (  0.1%)   0.1376 (  0.1%)  delete-unreachable -- DomTree
     0.1107 (  0.1%)   0.0102 (  0.6%)   0.1209 (  0.1%)   0.1210 (  0.1%)  insert-reachable -- DomTree
    93.2191 (100.0%)   1.8462 (100.0%)  95.0652 (100.0%)  95.0225 (100.0%)  Total

I don't have bitcode samples to reproduce your findings, but I guess the `three times faster` result comes from the DomTree spending a lot of time in `DeleteUnreachable()`.

Maybe we can add some high-level APIs to DTU that encode these best practices, such as `updateAfterSpliceBlocks` and `updateAfterChangePredecessorTo`, or just add a `sort` after update legalization. :)
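
To make the `sort` idea concrete, here is a rough, standalone sketch of what I have in mind. `UpdateKind`, `EdgeUpdate` and `scheduleInsertionsFirst` are hypothetical names for illustration only, not existing LLVM types or DTU APIs. It also assumes the batch has already been legalized, so no edge is both inserted and deleted in the same batch and reordering does not change the final CFG.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration; these are NOT LLVM types.
enum class UpdateKind { Insert, Delete };

struct EdgeUpdate {
  UpdateKind Kind;
  std::string From; // name of the source block
  std::string To;   // name of the destination block
};

// Reorders a (presumed already legalized) batch so that every insertion
// precedes every deletion. stable_partition keeps the relative order
// inside each group, so the originally scheduled order is otherwise kept.
inline void scheduleInsertionsFirst(std::vector<EdgeUpdate> &Updates) {
  std::stable_partition(
      Updates.begin(), Updates.end(),
      [](const EdgeUpdate &U) { return U.Kind == UpdateKind::Insert; });
}
```

A stable partition rather than a full sort seems sufficient here, because the only property we want is "insertions first"; whether this ordering helps in general would still need to be measured.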


Repository:
  rL LLVM

https://reviews.llvm.org/D54730




