[PATCH] D29865: [PDSE] Add a no-op pass.

Mon Feb 13 17:37:31 PST 2017

dberlin added a comment.

In https://reviews.llvm.org/D29865#675698, @bryant wrote:

> In https://reviews.llvm.org/D29865#674647, @dberlin wrote:
>
> > In https://reviews.llvm.org/D29865#674642, @davide wrote:
> >
> > > Also, it would be great if you can provide real examples of where this helps (e.g. how often do you expect this to trigger in practice, and why it matters).
> >
> >
> > I can provide cases :)
> >
> > There are a lot of them.
> > It's PRE for stores.
> >
> > > I assume you experimented enough with this to have a complete'ish pass somewhere out-of-tree. If so, did you run this on something to measure the impact?
> >
> > I mentioned a lot of this to bryant yesterday, that these are things that we'd have to work through.
> >
> > > Side question: do you want this to be run as part of the default pipeline eventually? What's the compile time cost?
>
>
> It should run faster than the existing DSE, which does a lot of quadratic-complexity clobber checks. PDSE makes only linear-time passes, apart from the initial pass that collects and categorizes occurrences, which does a quadratic number of alias checks.
>
> > Staring at it, I'm pretty confident the compile time cost can be made minimal.
> > The biggest cost is going to be the IDF calculation.
>
> Some additional avenues for speed-ups:
>
> - I think I can re-use MemorySSA's AccessLists (instead of doing an initial walk over every single instruction, as https://reviews.llvm.org/D29866 does now), since it already includes may-throws. I just need to walk each BB's list in reverse. The downside is that store insertions are a bit trickier, since the MSSA structure ought to be preserved.
> - For lambda insertion, the def blocks need to be computed. This means making an additional pass through the function to search for kill-only blocks (since real occurrence blocks are already known), but https://reviews.llvm.org/D29866 makes this extra pass **per occurrence class**. I think this could instead be done in one pass for all classes.

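It is possible to do it once per pass (for all classes together), at the cost of over-computing the def block set and ending up with useless lambdas that you then need to eliminate.
Otherwise, you can also just place the lambdas on the fly, using the dual of the SSA construction algorithm in http://www.cdl.uni-saarland.de/papers/bbhlmz13cc.pdf (Braun et al., "Simple and Efficient Construction of Static Single Assignment Form").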
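
To make the trade-off concrete, here is a rough sketch of the two ways of collecting def blocks in a single walk. It is only an illustration, not what D29866 does: `BlockSummary`, `collectDefBlocksPerClass` and `collectDefBlocksShared` are hypothetical names, and blocks are plain indices rather than `BasicBlock`s. The shared variant is my reading of "over-computing the set": every class gets the same def block set, so the IDF only has to be computed once, but some of the resulting lambdas will be useless for a given class.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <vector>

using ClassId = unsigned;     // index of an occurrence class
using BlockId = std::size_t;  // index of a basic block

// Hypothetical per-block summary; not an LLVM data structure.
struct BlockSummary {
  std::set<ClassId> RealOccs;  // classes with a real occurrence in this block
  std::set<ClassId> Kills;     // classes killed somewhere in this block
};

// Exact variant: one walk over the function, with each block recorded only
// for the classes it actually defines or kills.
std::map<ClassId, std::set<BlockId>>
collectDefBlocksPerClass(const std::vector<BlockSummary> &Blocks) {
  std::map<ClassId, std::set<BlockId>> Defs;
  for (BlockId B = 0; B < Blocks.size(); ++B) {
    for (ClassId C : Blocks[B].RealOccs)
      Defs[C].insert(B);
    for (ClassId C : Blocks[B].Kills)
      Defs[C].insert(B);
  }
  return Defs;
}

// Over-approximating variant: one set shared by all classes. Any block that
// matters to some class is treated as a def block for every class.
std::set<BlockId>
collectDefBlocksShared(const std::vector<BlockSummary> &Blocks) {
  std::set<BlockId> Defs;
  for (BlockId B = 0; B < Blocks.size(); ++B)
    if (!Blocks[B].RealOccs.empty() || !Blocks[B].Kills.empty())
      Defs.insert(B);
  return Defs;
}
```

The per-class sets are always subsets of the shared set, so the shared set is safe but imprecise; the lambdas it adds beyond what a class needs are the ones that would have to be pruned afterwards.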

Repository:
  rL LLVM

https://reviews.llvm.org/D29865