[PATCH] D36432: [IPSCCP] Add function specialization ability

Thu Aug 10 00:16:30 PDT 2017

davide added a comment.

In https://reviews.llvm.org/D36432#837245, @dberlin wrote:

> In https://reviews.llvm.org/D36432#836883, @mssimpso wrote:
>
> > In https://reviews.llvm.org/D36432#835343, @davide wrote:
> >
> > > Very neat work, thanks, this was in my todolist for a while :)
> > >  Some meta comments, I'll try to find some time to review this more carefully this week.
> >
> >
> > Thanks for taking a look!
> >
> > > I wonder if we're at a point where it makes sense to split the lattice solver from the transforms, as with your partial specialization code SCCP is growing a lot :)
> > > >  This would also have a bunch of other advantages, e.g. we could more easily try to plug in arbitrary lattices (for example for variables/ranges/bits).
> > > >  IIRC Chris Lattner started a propagation engine a while ago; it should still be in tree, but unused. What are your thoughts?
> >
> > Yes, I think I see what you're talking about. Analysis/SparsePropagation.h? I haven't looked at this closely yet, but I can see some advantages to using it. Any idea why SCCP hasn't already been switched over to use it?
> >
> > > 
>
>
> Same as anything else - lack of time for someone ;)
>
> We moved GCC's SCCP to a generic sparse propagation engine.
>  As for performance - I'm with Davide.
>  Cloning as the end transformation is nice, but it also tends to be a harder cost model to get right.

A (somewhat) related bug came to mind: https://bugs.llvm.org/show_bug.cgi?id=33253

> GCC was unable to get this right in practice. Most of the benefit from IPSCCP is producing range info for each context.
> 
> > Am I right in thinking the comment about jump functions is about our IPSCCP infrastructure in general, rather than specialization specifically? If I understand correctly, jump functions (like in Callahan et al.) are a trade-off between the precision of the analysis and the time/space needed to solve it. They summarize functions and then propagate the summaries through call sites. I think our current IPSCCP implementation should (at least in theory) be more precise than this. For example, if we find out via constant propagation that a call is not executed, this could enable more propagation/specialization, etc. But again, this is a trade-off since we may have to revisit a large portion of the module. I would definitely appreciate your thoughts.
> 
> Almost right. Our modeling should be more precise than *most* jump functions. But not all, I believe. I think what we are doing is equivalent to passthrough in reality [1]. Essentially, we have extended the constant prop lattice, for call functions, to do passthrough of arguments and whatever their value is, and not just constants.
>  But you can imagine a more powerful jump function. For example, the polynomial one. It can be viewed as "extending the constant prop lattice for formal parameters to be polynomial functions", not just constants.
> 
> So even if foo(a) and foo(b) are foo(2) and foo(4), maybe in reality it could be expressed as foo(argument * 2) or something. We would not get this. The polynomial jump function would.
> 
> So, you are right that as implemented by most (all?) compilers, they use passthrough, and we use passthrough, so we're the same.

Yes. When I read "the" paper about jump functions, https://scholarship.rice.edu/handle/1911/13733, I was under the impression that more sophisticated jump functions don't actually buy you much (and in a brief chat with David Callahan [who was around at the time] I got confirmation). That said, this was a long time ago and many things have changed, so maybe this can be re-evaluated at some point.
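
To make the passthrough vs. polynomial distinction concrete, here is a minimal sketch. It is a toy model, not LLVM code: `JumpFunction`, its factories, and `passThroughOnly` are hypothetical names, and a linear function `a * x + b` stands in for the general polynomial case. Each jump function describes one call-site argument in terms of one formal parameter of the caller.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Toy model (hypothetical, not LLVM's data structures).
struct JumpFunction {
  enum Kind { Unknown, Constant, PassThrough, Linear };
  Kind kind = Unknown;
  std::int64_t a = 0; // Constant: the value; Linear: the multiplier
  std::int64_t b = 0; // Linear: the addend

  static JumpFunction unknown() { return {}; }
  static JumpFunction constant(std::int64_t c) { return {Constant, c, 0}; }
  static JumpFunction passThrough() { return {PassThrough, 0, 0}; }
  static JumpFunction linear(std::int64_t a, std::int64_t b) {
    assert(a != 0 && "a zero multiplier is just a constant");
    return {Linear, a, b};
  }

  // Evaluate the argument given what is known about the caller's formal.
  // std::nullopt means "not a known constant".
  std::optional<std::int64_t> eval(std::optional<std::int64_t> formal) const {
    switch (kind) {
    case Constant:
      return a;
    case PassThrough:
      return formal; // forwarded unchanged
    case Linear:
      if (!formal)
        return std::nullopt;
      return a * *formal + b; // overflow is ignored in this sketch
    case Unknown:
      return std::nullopt;
    }
    return std::nullopt;
  }
};

// A framework that only supports constants and passthrough has to give up
// on anything richer, e.g. foo(argument * 2).
JumpFunction passThroughOnly(JumpFunction jf) {
  return jf.kind == JumpFunction::Linear ? JumpFunction::unknown() : jf;
}
```

With this model, `JumpFunction::linear(2, 0)` represents `foo(argument * 2)`: it evaluates to a constant as soon as the caller's formal is known, while `passThroughOnly` of the same function never does.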

FWIW, GCC doesn't even implement return jump functions (see e.g. https://godbolt.org/g/eahtHR ) and relies on inlining/cloning to get things right, so in this respect LLVM's constant propagation is actually stronger than GCC's. (If I remember correctly, they don't implement this because it's a little complicated to handle mutually recursive functions while walking down the DAG of SCCs, but if we ever go for real jump functions we should consider implementing returns as well.)
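
For readers who have not seen the term, the kind of input where a return jump function matters looks roughly like the sketch below. This is an illustration only and not the contents of the godbolt link; `answer` and `caller` are made-up names, and `__attribute__((noinline))` is a GCC/Clang extension used here to take inlining out of the picture.

```cpp
// Without inlining, folding caller() to a constant requires propagating
// the callee's *return value* back to the call site.
__attribute__((noinline)) static int answer() { return 42; }

int caller() {
  // An interprocedural pass that tracks return values can replace this
  // expression with 43; one that only propagates arguments cannot.
  return answer() + 1;
}
```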

> [1] I believe what we do can be proven equivalent to passthrough, because we merge the state of incoming arguments from the call sites' parameters, without changing the lattice value. If all arguments just pass through, we will pass through the state. I.e., without us doing anything else, given a call chain of arguments of foo(a) ->bar(a)->bob(a), a will remain underdefined in our algorithm.
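
A minimal sketch of the merge described in [1], under my own assumptions: a three-level constant lattice in which `Unknown` plays the role of what the footnote calls "underdefined". `LatticeVal`, `meet`, `mergeCallSites` and `passThroughChain` are hypothetical names, not the ones used in SCCP.cpp.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy constant-propagation lattice: Unknown (top) > Constant > Overdefined.
struct LatticeVal {
  enum State { Unknown, Constant, Overdefined };
  State state = Unknown;
  std::int64_t value = 0; // meaningful only when state == Constant

  static LatticeVal constant(std::int64_t v) { return {Constant, v}; }
  static LatticeVal overdefined() { return {Overdefined, 0}; }

  std::int64_t getConstant() const {
    assert(state == Constant && "not a constant");
    return value;
  }
};

bool operator==(const LatticeVal &a, const LatticeVal &b) {
  if (a.state != b.state)
    return false;
  return a.state != LatticeVal::Constant || a.value == b.value;
}

// Meet of two lattice values: Unknown is the identity, Overdefined absorbs,
// and two different constants collapse to Overdefined.
LatticeVal meet(const LatticeVal &a, const LatticeVal &b) {
  if (a.state == LatticeVal::Unknown)
    return b;
  if (b.state == LatticeVal::Unknown)
    return a;
  if (a == b)
    return a;
  return LatticeVal::overdefined();
}

// The state of a formal parameter is the meet of the actual arguments at
// every call site; the values themselves are never transformed.
LatticeVal mergeCallSites(const std::vector<LatticeVal> &actuals) {
  LatticeVal formal; // starts Unknown
  for (const LatticeVal &v : actuals)
    formal = meet(formal, v);
  return formal;
}

// foo(a) -> bar(a) -> bob(a): each callee has a single call site that
// forwards the argument, so the state arrives unchanged.
LatticeVal passThroughChain(LatticeVal a, int depth) {
  for (int i = 0; i < depth; ++i)
    a = mergeCallSites({a});
  return a;
}
```

In this model, call sites `foo(2)` and `foo(4)` merge to `Overdefined`, and an `Unknown` argument stays `Unknown` all the way down a passthrough chain.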
> 
> Note: Wegman's SCCP paper covers a variant of IPSCCP.
>  https://www.cs.utexas.edu/users/lin/cs380c/wegman.pdf
>  They just link all the SSA procedures together, and run IPSCCP on it. What we do should be identical in practice, I believe.
> 
> It should be linear in the total size of the program (or we screwed up :P)
>  It also talks about integrating it with inlining.

I agree this is a good path forward.

https://reviews.llvm.org/D36432