[llvm-dev] RFC Storing BB order in llvm::Instruction for faster local dominance

Wed Sep 26 22:24:41 PDT 2018

> On Sep 26, 2018, at 11:55 AM, Reid Kleckner <rnk at google.com> wrote:
> 
>> 
>> As suggested in the bug, if we were to rewrite these passes to use MemorySSA, this bottleneck would go away. I rebased a patch to do that for DSE, but finishing it off and enabling it by default is probably out of scope for me.
> 
> Yes, understood, that would be a major change.  That said, changing the entire architecture of the compiler to work around this isn’t really appealing to me, I’d rather limit the problematic optimizations on the crazy large cases.
> 
> You're probably right, these passes probably aren't actually firing. But, it sounds like we have two options:
> 1. Add cutoffs to poorly understood slow passes that are misbehaving
> 2. Make a core data structure change to make dominance calculations faster and simpler for all transforms
> 
> I can do this analysis, and add these cutoffs, but I wouldn't feel very good about it. It adds code complexity, we'll have to test it, and tomorrow someone will add new quadratic dominance queries. I don't see how heuristically limiting misbehaving optimizations builds a better foundation for LLVM tomorrow.

My answer to this is that the long term path is to move these passes to MemorySSA, which doesn’t have these problems.  At which point, the core IR change you are proposing becomes really the wrong thing.

> Dominance has been and will always be an important analysis in LLVM. This patch is basically pushing an analysis cache down into IR. Would it be accurate to say that your objection to this approach has more to do with analysis data living in IR data structures?
> 
> If I added more infrastructure to invert the dependency, store these instruction numbers in DominatorTree, and make BasicBlock notify DominatorTree of instruction insertion and deletion, would that be preferable? That's very similar to the code we're writing today, where the transform takes responsibility for marrying its modifications to its invalidations. The question is, do we want to keep this stuff up to date automatically, like we do for use lists, or is this something we want to maintain on the side? I think dominance is something we might want to start pushing down into IR.

I have several objections to caches like this, and we have been through many failed attempts at making analyses “autoadapt” to loosely coupled transformations (e.g. the CallbackVH stuff, sigh).  My objections are things like:

1) Caching and invalidation are very difficult to get right.
2) Invalidation logic slows down everyone even if they aren’t using the cache (your “check to see if I need to invalidate when permuting the ilist).
3) We try very hard not to put analysis/xform specific gunk into core IR types, because it punishes everyone but benefits only a few passes.
4) LLVM is used for a LOT of widely varying use cases - clang is just one client, and many of them don’t run these passes.  This change pessimizes all those clients, particularly the most sensitive 32-bit systems.
5) If done wrong, these caches can lead break invariants that LLVM optimization passes are supposed to maintain.  For example, if you do “sparse numbering” then the actual numbering will depend on the series of transformations that happen, and if someone accidentally uses the numbers in the wrong way, you could make “opt -pass1 -pass2” behave differently than “opt -pass1 | opt -pass2”.

Putting this stuff into DominatorTree could make sense, but then clients of dominator tree would have to invalidate this when doing transformations.  If you put the invalidation logic into ilist, then many of the problems above reoccur.  I don’t think (but don’t know) that it is unreasonable to make all clients of DT invalidate this manually.

-Chris

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20180926/8d0b8392/attachment.html>