[PATCH] D46893: [CVP] Require DomTree for new Pass Manager

Chandler Carruth via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri May 18 10:18:56 PDT 2018


chandlerc added a comment.

In https://reviews.llvm.org/D46893#1101272, @dberlin wrote:

> So, this seems very confused.
>  getBestSimplifyQuery simply queries the analysis manager to see what passes are available.
>
> This was used to replace something even *more* vague, where passes were randomly plumbing or not plumbing pieces they had to simplifyInstruction various places, and were doing it quite wrong.
>  It was generally considered a significant improvement over what existed before.
> You can look at what it replaced and make that decision.
>
> All that said,  it does not *require* anything.
>  Nor does SimplifyInstruction, which was deliberately built to not require a DomTree to function properly, and is used in places where it is not up to date or correct.
> Hence, getBestSimplifyQuery uses what is available, as do the SimplifyQuery structure and SimplifyInstruction.
>
> It is only meant to be used with SI, and using it elsewhere is a bad plan.
>
> Otherwise, I disagree with Chandler strongly about going back to constructing SimplifyQuery's directly.  It was wrong often enough that this was clearly not a good way of doing things, IMHO.


I think we're miscommunicating about this.

I'm not necessarily suggesting constructing SimplifyQuery directly. I'm not trying to suggest any previous API is good.

I'm trying to point out that we're going to have a debugging and stability problem if we rely heavily on getting "available" passes in the new PM that is significantly different from the old PM. I'll try to explain the underlying problem here in a bit more detail so that we're on the same page.

In the old PM, getting the available analyses was a really nice approach. Because the schedule of pass runs was decided up-front, getting available analyses essentially said "whatever the pass pipeline looks like, use what is there". This is a great optimization and avoids lots of overly tight coupling between passes and their schedule. But the results were *extremely predictable* -- there would be exactly one set of analyses available for a pass at a given position in the pipeline. I see zero problems here, and the API we're discussing does indeed seem way, way better than manually building SimplifyQuery.
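Concretely, the legacy-PM pattern looks roughly like this (a sketch only, not the actual getBestSimplifyQuery implementation; the helper name is made up for illustration):

  #include "llvm/IR/Dominators.h"
  #include "llvm/Pass.h"

  // Sketch: in the legacy PM, "available" is determined by the statically
  // scheduled pipeline, so this returns the same thing on every run of the
  // pass at a given pipeline position.
  static llvm::DominatorTree *getDomTreeIfScheduled(llvm::Pass &P) {
    if (auto *DTWP =
            P.getAnalysisIfAvailable<llvm::DominatorTreeWrapperPass>())
      return &DTWP->getDomTree();
    return nullptr;
  }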

But in the new PM, we have a very different situation. Because analyses are cached (and invalidated) dynamically, the "available" thing is really subtle. It isn't static, it is dynamic. One module may have an analysis available, while a *very* subtly different module won't. You can even have action at a distance with this, where one function changes in the module, and on a completely unrelated function the analysis is no longer available (see the sketch after the list below). I see at least three issues with this:

1. It will make missed optimizations very hard to debug because the behavior is much more complicated to reason about. For example, it will make bugpoint reduction of failures substantially harder, as you'll end up needing to preserve lots of seemingly inexplicable things to get particular behaviors to occur.
2. If we ever want to add memory usage limiting functionality to the new PM (which everyone was expecting in the early days but hasn't been needed thus far), those memory-saving clears of the cache *will change optimization power*.
3. It may in rare cases hurt users, when seemingly benign refactorings and changes cause the optimizer to improve or regress in ways they can't understand and don't expect.
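To make the "dynamic availability" point concrete, here is a rough new-PM sketch (illustrative only, not code from this patch):

  #include "llvm/IR/Dominators.h"
  #include "llvm/IR/Function.h"
  #include "llvm/IR/PassManager.h"

  // Sketch: getCachedResult returns whatever happens to be in the analysis
  // cache for F right now -- it depends on which passes ran earlier, what
  // they invalidated, and even on changes to unrelated functions.
  static llvm::DominatorTree *
  getDomTreeIfCached(llvm::Function &F, llvm::FunctionAnalysisManager &FAM) {
    return FAM.getCachedResult<llvm::DominatorTreeAnalysis>(F);
  }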

Now, all I'm trying to say here is that getting available analyses in the new PM is substantially more risky than in the old PM. I'm not suggesting any particular API approach. Just saying that in the new PM, especially with analyses being somewhat cheaper due to caching, we should be skewing much, much further towards requiring analyses rather than getting them if available.

That still could mean the raw SimplifyQuery API is bad, and that we need a better API. Totally on board with that. I'm just hoping that, to the extent possible, we can design APIs that actually require the analyses in the new PM rather than using whatever happens to be sitting in the cache.
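For example, the direction I'd like us to skew towards looks roughly like this (a hedged sketch with a placeholder pass name; the actual run method in this patch may differ in detail):

  #include "llvm/IR/Dominators.h"
  #include "llvm/IR/PassManager.h"

  // Sketch of a new-PM function pass that *requires* the DomTree rather than
  // taking whatever happens to be cached. "ExamplePass" is a placeholder.
  struct ExamplePass : llvm::PassInfoMixin<ExamplePass> {
    llvm::PreservedAnalyses run(llvm::Function &F,
                                llvm::FunctionAnalysisManager &AM) {
      // getResult computes the DomTree if it isn't already cached, so the
      // pass always sees it; behavior no longer depends on cache contents.
      auto &DT = AM.getResult<llvm::DominatorTreeAnalysis>(F);
      (void)DT; // ... feed DT into the SimplifyQuery / the rest of the pass ...
      return llvm::PreservedAnalyses::all();
    }
  };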

That also doesn't necessarily mean this patch isn't an incremental improvement over the prior state. I'm perfectly happy if we need to fix this in a follow-up. I just want people to be aware of and understand the risk and issues here.


https://reviews.llvm.org/D46893




