[PATCH] D21464: [PM] WIP: Introduce basic update capabilities to the new PM's CGSCC pass manager, including both plumbing and logic to handle function pass updates.

Wed Jun 29 17:42:18 PDT 2016

On Wed, Jun 29, 2016 at 1:16 AM Sean Silva via llvm-commits <
llvm-commits at lists.llvm.org> wrote:

> On Wed, Jun 29, 2016 at 12:03 AM, Xinliang David Li <davidxl at google.com>
> wrote:
>
>> On Tue, Jun 28, 2016 at 5:41 PM, Chandler Carruth <chandlerc at gmail.com>
>> wrote:
>>
>>> As an example of the kind of missed optimizations, I'm going to add a
>>> test case where we successfully deduce readnone with the new pass manager
>>> but can't (no matter how many iterations we make) deduce it with the old
>>> one. This is because we don't refine the SCC graph structure, and do
>>> principled visitation to that graph structure.
>>>
>>
>> I saw the test case. The question is there anything preventing SCC graph
>> be refined for the second iteration in the old framework?
>>
>
> I tested this with my proof of concept port of the old visit order to the
> new pass manager [*]. With the following tiny modification of
> RefreshCallGraph, the port passes both of Chandler's tests:
> http://reviews.llvm.org/P6012
>

As you mentioned, I had over-simplified my test cases, and they weren't
testing what I was describing in email, sorry about that. They no longer
required a *refined* SCC graph.

I've updated the test cases to actually check the refined SCC graph, and
even with the patch you mentioned, the old pass manager can't figure this
out.

Of course, there is a way to get this with the old pass manager. You have
to re-build the call graph. But that requires a full re-run of the
bottom-up walk. You can see this by doing 'opt -functionattrs -simplifycfg
-functionattrs' which will get this case, but will build an entirely new
call graph to do so.

What I'm trying to point out is a fundamental limitation that is addressed
by doing on-line update and refinement of the graph without re-starting the
entire pipeline. Supporting that + identity based caching for the new pass
manager were the two driving motivations for the design of the
LazyCallGraph and I continue to think having this functionality is really
essential.

> In fact, I even had to modify cgscc-observe-devirt.ll because the old
> visitation order (with the small patch) is able to infer readnone on
> test1_b (Chandler's patch has a FIXME with "AFTER2" as the desired
> behavior).
>

Yes, I haven't added the devirtualization trigger that the old pass manager
has. I can add it easily in a follow-up patch, but it is orthogonal to this
one as it doesn't actually require call graph structural updates, it just
requires detecting indirect -> direct call transformations.

If its useful, I'm happy to prioritize implementing that and show you what
it would look like in this model. It is pretty easy to do. It mostly
requires a trivial recreation of an extra data structure that the old Call
Graph provides (at great cost) that has a value handle for all the indirect
calls so we can notice when they change.

>
> The test cases could be probably be modified so that the old visitation
> logic (or small modifications thereof) would fail yet Chandler's proposed
> logic would succeed. But the test cases seem to get increasingly
> far-fetched.
>

I don't think the modifications are terribly far fetched. All it requires
is for it to involve SCC refinement.

> Regardless, if/when the evidence appears for those cases being important
> to optimize, we can add them; porting to the new PM is orthogonal.
>
> [*] Hopefully I will start an RFC and send out a proof of concept patch
> tomorrow; I just need to test out a proof of concept port of the inliner on
> something bigger to make sure everything works.
>

As I have said in this thread, several other threads when LazyCallGraph was
first designed, and in several other discussions, I continue to think this
is the wrong approach.

Not having an identity preserving call graph data structure will make the
new pass manager design flawed. I think we will have to add hacks to get
invalidation and preservation correct in that model, and will continue to
observe the fact that we cannot cache analyses on the actual SCCs.

While I understand wanting to pursue another option if we didn't have this
patch which implements the more principled logic, given that this patch is
out and I'm starting work on porting the inliner now, I'd somewhat like to
continue down this path unless there are specific insurmountable problems
that people see here.

> -- Sean Silva
>
>
>>
>> If we are really keen in solving this problem, another way is to do a
>> simple module pass to do the cleanup before the SCC is built. This will at
>> least work for the examples in the test.
>>
>>
>>
>>> The examples in the test case are somewhat contrived in order to make
>>> good test cases, but they hopefully represent the general pattern that I'm
>>> worried about here: we are using the set of functions in the SCC to reason
>>> about it as an SCC but not actually updating that set of functions when the
>>> code changes.
>>>
>>
>> I understand the motivation. What I want to know  is the
>> complexity/benefit trade-offs here. Using refined attributes due to
>> de-virtualization as example is not convincing IMO.
>>
>> thanks,
>>
>> David
>>
>>
>>>
>>> > The general idea is clearly to maintain a bottom-up visitation
>>> behavior, but since this patch doesn't implement edge addition (which
>>> represents devirtualization and so is pretty important) the intended
>>> behavior there is unclear.
>>>
>>>
>>> So, this *does* implement ref -> call promotion which is what I would
>>> expect devirtualization to look like in all cases. See below for details on
>>> your question.
>>>
>>> The other case is for "out-of-thin-air" call (or ref) insertion such as
>>> instrumentation passes might do. Currently this isn't supported but could
>>> be added in the future.
>>>
>>>
>>> ================
>>> Comment at: lib/Analysis/CGSCCPassManager.cpp:90
>>> @@ +89,3 @@
>>> +    for (Instruction &I : BB)
>>> +      if (auto CS = CallSite(&I))
>>> +        if (Function *Callee = CS.getCalledFunction())
>>> ----------------
>>> (in case reading this against the current patch, this originally was
>>> attached to the FIXME regarding handling adding new calls)
>>>
>>> As i somewhat alluded to above, what you describe should be handled by a
>>> ref edge turning into a call edge, and the update mechanism should be able
>>> to handle that well.
>>>
>>> I've added a test case that exercises this with GVN and function-attrs.
>>> There is currently a case missed in the new pass manager because we don't
>>> have the up-to-four iteration whenever an indirect call turns into a direct
>>> call heuristic that the old pass manager has. I'm happy to add that, but
>>> I'd like to add it in a follow-up patch. I've marked where in the test case
>>> this is missed, and I've demonstrated that in theory this update mechanism
>>> is sufficient to handle it by explicitly running function-attrs again and
>>> it correctly catches the refinement.
>>>
>>> The direct answer to your question is #2: it continues running on the
>>> now-larger SCC, detects that we switch from one SCC to another at some
>>> point, and re-runs on that SCC to make sure that the refined graph is
>>> observed.
>>>
>>> ================
>>> Comment at: lib/Analysis/CGSCCPassManager.cpp:170
>>> @@ +169,3 @@
>>> +
>>> +    // The easy case is when the target RefSCC is not this RefSCC. This
>>> is
>>> +    // only supported when the target RefSCC is a child of this RefSCC.
>>> ----------------
>>> sanjoy wrote:
>>> > Doesn't this also hold if the target SCC is not this SCC?
>>> Yes, and inside the implementation of switchInternalEdgeToRef, it early
>>> exits with an empty range when we hit that scenario.
>>>
>>> I can lift that distinction up into this code if you think that would be
>>> helpful, the somewhat arbitrary split was that these methods are on a
>>> RefSCC, and so it is the RefSCC-easy case that callers have to handle.
>>>
>>>
>>> http://reviews.llvm.org/D21464
>>>
>>>
>>>
>>>
>> _______________________________________________
> llvm-commits mailing list
> llvm-commits at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20160630/401e0495/attachment.html>