[PATCH] D35851: [Dominators] Include infinite loops in PostDominatorTree

Wed Aug 2 10:46:31 PDT 2017

grosser added a comment.

In https://reviews.llvm.org/D35851#828525, @kuhar wrote:

> @grosser
>
> Tobias,
>
> Before documenting my code more, let me answer your questions and comment on your counterproposal.

Very good idea.

>> this is not well documented in the literature. The paper you based the dominator tree construction on ("An Experimental Study of Dynamic Dominators") seems to very clearly distinguish between unreachable and reachable CFG nodes.
> 
> I think that it’s pretty convenient to assume that functions always have a single exit for academic purposes. If this was the case in LLVM, we could build postdominators on reverse CFG and literally be done at that point. But that’s not the case, and there’s a real life difference between forward-unreachable and reverse-unreachable code. There’s no way to end up in forward-unreachable code, because we always start executing code from the (single) entry node. Whereas when it comes to reverse-unreachable code, executing it happens in practice and we actually care about optimizing such code. Many embedded systems consist of a single large infinite loop that reads some memory and writes data somewhere else. And it is actually possible to exit the infinite loop there, just by calling a function with a call to exit inside. IMHO it is not much different from dynamically infinite loops, which are naturally modeled by postdominators. Because of that, it seems entirely reasonable to me to include reverse-unreachable in the PostDominatorTree.

While I currently do not have any benchmark where optimizing infinite loops is useful (is there one I could have a look?), let's working under the assumption that it actually is. Chandler and Daniel raised the point that just the fact that the PDT contains all nodes makes user code simpler, which is another argument for this kind of modeling. In general, both points together make a good motivation to include all nodes in the PDT. I agree, assuming we understand and are happy with the resulting implications.

>> that it is in practice desirable for some reason (which?)
> 
> The main motivation for modeling reverse-unreachable code and (statically) infinite loops in the PostDominatorTree is safely sinking code and optimizing code within infinite loops. To do that, you cannot pretend that reverse-reachable code dominates code that can branch to an infinite loop, which I find particularly problematic in your proposal.
>  F3974694: Post_Dominance_(3).png <https://reviews.llvm.org/F3974694>
> 
> Let’s consider one of your examples. In this case your implantation says that B immediately postdominates D, even though it is possible to branch to C and keep looping there until something exits the program. It would seem valid to sink instruction from B to D, which would not be the case if C had a dynamically unreachable exit.

Thank you for coming up with an actual code transformation.

I assume you mean D postdom B (rather than B postdom D).

You say the fact that D postdom B is by itself sufficient to allow code sinking from B to D. Remember the original program:

F3996712: Post Dominance.png <https://reviews.llvm.org/F3996712>

which, as in https://reviews.llvm.org/D36107, may terminate in C with "br i1 true, label %C, label %B". To my understanding there is **dynamically** no difference between the original code and the code after dropping the edge C -> B. Both have the property that B postdom D. Would code sinking from B to D be in your opinion valid in the original program?

> This example is more about profitability, but the real problem would be hoisting here, even assuming that there are no functions call in B, C, or D. Say there is instruction that can cause undefined behavior in D, even something as simple like divide by zero. D immediately postdominates B and is its successor, so it would seem that hoisting that instruction to B is safe (assuming that we already proved that it is from other standpoints), and that instruction wouldn’t normally be executed if the control flow entered the infinite loop C. The same applies to hoisting loads and stores from D to B.

The same question as above: what prevents you to hoist these very instructions in the original program with a dynamically infinite loop?

AFAIU X postdom Y does not guarantee that after Y is executed X is at some later point certainly executed.

> The other problem I see is that your definition of postdominance for reverse-unreachable code doesn’t lead to useful regions, if I understand it correctly. Regions in LLVM are defined as single-entry single-exit parts of CFG where the entry dominates every node in the region and the exit postdominates every node in it. For your example, it seems that there would be 2 regions: ABD and C, which doesn’t seem correct to me, as in practice you can jump from B, get stuck and then exit through C. Using the postdominance as defined https://reviews.llvm.org/D35851 doesn’t have this problem and results in 3 regions: AB, C, and D.

Disclosure. I added the region test cases and I believe they are indeed useful the way they are. ;-)

I assume you  mean "you can jump from B **to C**".

Regions for fully reverse reachable CFGs can be well defined through (post-)dominance. As we try to find a post-dominance definition that is consistent with what we expect, I prefer to define regions as a subgraph of the CFG that is connected to the CFG with a single incoming edge and a single outgoing edge.

Do we agree that the region for the original program F3994838: region.png <https://reviews.llvm.org/F3994838> computed today makes sense? Similar to the example above, by dropping an edge that is known to not be executed, we just restrict the set of program executions the PDT can assume. No new incoming and outcoming edges are added:

F3998449: region.png <https://reviews.llvm.org/F3998449>

So from this perspective I would assume the new region, to still be a region. Would you agree?

Today we achieve this result by checking that the entry dominates the exit and the exit post-dominates the entry. To check if a node is contained in the region, we check if it is dominated by entry and not dominated by exit. Post-dominance for reverse-unreachable blocks is unfortunately not very useful here for the reason you stated above. Another option was to put the virtual exit edge not to the actual exit, but to the node that connect reverse unreachable and reverse reachable parts of the CFG. This would address this issue in most common cases. If this is preferable, I could port this patch to your most recent code.

> Saying that reverse-unreachable code never postdominates reverse-reachable code is like saying infinite loops, even with side effects, has undefined behavior and we assume to never execute it.
> 
>> You define the parent property as: "If a node gets disconnected from the graph, then all of the nodes it dominated previously will now become unreachable." After this patch, nodes that become unreachable are still connected to the graph.
> 
> 
> 
>> Interestingly, the property D postdom B does not hold any more. To my understanding, in a complete (post) dominator tree, removing an edge should never weaken a normal dominatore tree. Is this true? Do we loose this property by supporting reverse-unreachable CFG nodes?
> 
> This may be a little bit unintuitive at first

Then it certainly warrants some documentation. Thank you for explaining!

> but here’s what happens when you call .deleteEdge with my patch: it first performs the edge deletion and can then immediately makes an insertion from the virtual exit to the reverse-unreachable code. From the end user’s perspective, those two steps happen atomically, but in reality it performs two different operations with an intermediate result not observable from outside. If you consider these two internal steps in terms of the parent and sibling property, the first one holds after deletion and the second one after insertion.

Interesting! Do I understand correctly that for  a normal compete (post)dominator tree these two steps fall together. Hence, both parent and sibling property hold always?

>> This is not visible in verifyParentProperty, as it does not update Roots properly
> 
> Verifying the parent property doesn’t really depend on some information internal to the DomTree -- you could place a .verifyParentProperty call in the middle of deletion and it would check it fine. The only thing that .verify would disagree with would be the new root not present in the Roots yet, but that’s kind of expected here, and the deletion and insertion happen atomically anyway. In this respect, neither deletion nor verification seems broken to me.

Thank you for explaining the above. Yes, verify is verifying under the assumption that no virtual edges are added. Without virtual edges, the properties certainly hold. This difference is again something interesting to point out.

Best,
Tobias

https://reviews.llvm.org/D35851