<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class=""><br class=""><div><br class=""><blockquote type="cite" class=""><div class="">On Sep 19, 2018, at 1:30 PM, Reid Kleckner via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org" class="">llvm-dev@lists.llvm.org</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><div dir="ltr" class=""><div dir="ltr" class=""><div class="">Hi folks,</div><div class=""><br class=""></div><div class="">I looked into some slow compiles and filed <a href="https://bugs.llvm.org/show_bug.cgi?id=38829" class="">https://bugs.llvm.org/show_bug.cgi?id=38829</a>. The gist of it is that we spend a lot of time iterating basic blocks to compute local dominance, i.e. given two instructions in the same BB, which comes first.<br class=""></div><div dir="ltr" class=""><br class=""></div><div class="">LLVM already has a tool, OrderedBasicBlock, which attempts to address this problem by building a lazy mapping from Instruction* to position. The problem is that cache invalidation is hard. If we don't cache orderings at a high enough level, our transformations become O(n^2). If we cache them too much and insert instructions without renumbering the BB, we get miscompiles. My solution is to hook into the actual BB ilist modification methods, so that we can have greater confidence that our cache invalidation is correct.</div><div dir="ltr" class=""><br class=""></div><div dir="ltr" class="">I created a patch for this at <a href="https://reviews.llvm.org/D51664" class="">https://reviews.llvm.org/D51664</a>, which adds a lazily calculated position integer to every llvm::Instruction. I stole a bit from BasicBlock's Value subclass data to indicate whether the orders are valid.</div><div dir="ltr" class=""><br class=""></div><div class="">Hopefully everyone agrees that this a reasonable direction. I just figured I should announce this IR data structure change to the -dev list. :)</div></div></div></div></blockquote><br class=""></div><div>I haven’t had a chance to look at the patch in detail yet (hopefully this afternoon) but this sounds like a very invasive change to a core data structure.</div><div><br class=""></div><div>The inner loop of the local dominance check in DominatorTree::dominates is also not very well implemented: it does a single linear pass from the beginning of the block until it finds the def or user.  A better algorithm would be to use two pointers - one at the user and def.  Each time through the loop, move the user iterator “up” the block, and the def iterator “down” the block.  Either the iterators meet each other (in which case return true) or you fine the beginning/end of the block.</div><div><br class=""></div><div>This should work a lot better for many queries, because it will be efficient when the user and def are close to each other, as well as being efficient when the value is at the end of the block.  Also, my bet is that most local dom queries return true.</div><div><br class=""></div><div>Have you tried this approach?  It should be very easy to hack up to try out on your use case.</div><div><br class=""></div><div>-Chris</div><div><br class=""></div><br class=""></body></html>