[PATCH] D36432: [IPSCCP] Add function specialization ability

Wed Aug 9 13:07:28 PDT 2017

dberlin added a comment.

In https://reviews.llvm.org/D36432#836883, @mssimpso wrote:

> In https://reviews.llvm.org/D36432#835343, @davide wrote:
>
> > Very neat work, thanks, this was in my todolist for a while :)
> >  Some meta comments, I'll try to find some time to review this more carefully this week.
>
>
> Thanks for taking a look!
>
> > I wonder if we're at a point where it makes sense to split the lattice solver from the transforms, as with your partial specialization code SCCP is growing a lot :)
> >  This would also have a bunch of other advantages, e.g. we could more easily try to plug in arbitrary lattices (for example for variables/ranges/bits).
> >  IIRC Chris Lattner started a propagation engine a while ago, it should still be in tree, but unused.  What are your thoughts?
>
> Yes, I think I see what you're talking about. Analysis/SparsePropagation.h? I haven't looked at this closely yet, but I can see some advantages to using it. Any idea why SCCP hasn't already been switched over to use it?
>
> > 

Same as anything else - lack of time for someone ;)

We moved GCC's SCCP to a generic sparse propagation engine.
As for performance - I'm with Davide.
Cloning as the end transformation is nice, but it also tends to be a harder cost model to get right.

GCC was unable to get this right in practice.  Most of the benefit from IPSCCP comes from producing range info for each context.

> Am I right in thinking the comment about jump functions is about our IPSCCP infrastructure in general, rather than specialization specifically? If I understand correctly, jump functions (like in Callahan et al.) are a trade-off between the precision of the analysis and the time/space needed to solve it. They summarize functions and then propagate the summaries through call sites. I think our current IPSCCP implementation should (at least in theory) be more precise than this. For example, if we find out via constant propagation that a call is not executed, this could enable more propagation/specialization, etc. But again, this is a trade-off since we may have to revisit a large portion of the module. I would definitely appreciate your thoughts.

Almost right.  Our modeling should be more precise than *most* jump functions, but not all, I believe. I think what we are doing is equivalent to passthrough in reality [1].  Essentially, we have extended the constant prop lattice, for function calls, to do passthrough of arguments and whatever their value is, and not just constants.
But you can imagine a more powerful jump function.  For example, the polynomial one.  It can be viewed as "extending the constant prop lattice for formal parameters to be polynomial functions", not just constants.

So even if foo(a) and foo(b) are foo(2) and foo(4), maybe in reality both could be expressed as foo(argument * 2) or something.  We would not get this. The polynomial jump function would.
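
To make the difference concrete, here is a small toy model (Python, not LLVM code). It assumes a hypothetical caller wrap(argument) that calls foo(argument * 2) and is itself reached with argument = 1 and argument = 2. The tuple encoding of actual arguments and the names eval_passthrough, eval_linear and wrap are made up for illustration, and "linear" stands in for the simplest case of a polynomial jump function.

```python
# Toy model only: names and encodings here are hypothetical, not LLVM's IPSCCP API.
UNKNOWN = "unknown"          # top: nothing known yet
OVERDEFINED = "overdefined"  # bottom: not a single constant


def eval_passthrough(actual, caller_formal):
    """Pass-through jump function: only constants and forwarded formals are summarized."""
    kind = actual[0]
    if kind == "const":
        return actual[1]
    if kind == "formal":
        return caller_formal
    return OVERDEFINED  # any computed expression falls outside the summary


def eval_linear(actual, caller_formal):
    """Degree-1 polynomial jump function: also summarizes coeff * formal + offset."""
    if actual[0] != "linear":
        return eval_passthrough(actual, caller_formal)
    _, coeff, offset = actual
    if caller_formal in (UNKNOWN, OVERDEFINED):
        return caller_formal
    return coeff * caller_formal + offset


# wrap(argument) calls foo(argument * 2); wrap is reached with argument = 1 and 2.
call_arg = ("linear", 2, 0)
contexts = [1, 2]
passthrough_view = [eval_passthrough(call_arg, c) for c in contexts]
linear_view = [eval_linear(call_arg, c) for c in contexts]
print(passthrough_view)  # ['overdefined', 'overdefined']
print(linear_view)       # [2, 4]
```

In this model the pass-through summary gives up on foo's parameter, while the linear summary still recovers foo(2) and foo(4) per calling context.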

So, you are right that most (all?) compilers, as implemented, use passthrough, and we use passthrough, so we're the same.

[1] I believe what we do can be proven equivalent to passthrough, because we merge the state of incoming arguments from the call sites' parameters, without changing the lattice value.  If all arguments just pass through, we will pass through the state.  I.e., without us doing anything else, given a call chain of arguments of foo(a) -> bar(a) -> bob(a), a will remain underdefined in our algorithm.
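
The merging described in [1] can be sketched as a small worklist loop (Python, a toy model rather than the actual solver). It assumes every function has a single formal that it forwards unchanged to its callee, and it reads "underdefined" as the initial, not-yet-resolved lattice state, called UNKNOWN below. The names meet, propagate, call_sites and seeds are hypothetical.

```python
# Toy model only: one formal per function, forwarded unchanged along each call edge.
UNKNOWN = "unknown"          # initial state: nothing known yet
OVERDEFINED = "overdefined"  # conflicting values were merged


def meet(x, y):
    """Merge two lattice values without inventing new information."""
    if x == UNKNOWN:
        return y
    if y == UNKNOWN:
        return x
    if x == OVERDEFINED or y == OVERDEFINED:
        return OVERDEFINED
    return x if x == y else OVERDEFINED


def propagate(call_sites, seeds):
    """call_sites: (caller, callee) pairs; seeds: constants passed in from outside."""
    state = {}
    for caller, callee in call_sites:
        state.setdefault(caller, UNKNOWN)
        state.setdefault(callee, UNKNOWN)
    for fn, values in seeds.items():
        state.setdefault(fn, UNKNOWN)
        for value in values:
            state[fn] = meet(state[fn], value)
    worklist = list(state)
    while worklist:
        fn = worklist.pop()
        for caller, callee in call_sites:
            if caller != fn:
                continue
            merged = meet(state[callee], state[caller])
            if merged != state[callee]:
                state[callee] = merged      # callee's formal takes the caller's state
                worklist.append(callee)     # revisit whatever the callee calls
    return state


chain = [("foo", "bar"), ("bar", "bob")]
print(propagate(chain, {}))               # every formal stays 'unknown'
print(propagate(chain, {"foo": [7]}))     # 7 passes through to bar and bob
print(propagate(chain, {"foo": [2, 4]}))  # the conflict passes through as 'overdefined'
```

Whatever state reaches foo is exactly the state that reaches bob; nothing along the chain refines or changes it, which is the pass-through behaviour the footnote describes.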

Note: Wegman's SCCP paper covers a variant of IPSCCP.
https://www.cs.utexas.edu/users/lin/cs380c/wegman.pdf
They just link all the SSA procedures together, and run IPSCCP on the result. What we do should be identical in practice, I believe.

It should be linear in the total size of the program (or we screwed up :P)
The paper also talks about integrating it with inlining.

https://reviews.llvm.org/D36432