[PATCH] D35851: [Dominators] Include infinite loops in PostDominatorTree

Tue Aug 1 08:05:11 PDT 2017

grosser added a comment.

Hi @kuhar, @dberlin,

thanks for giving me time to look into this and especially thanks for adding dynamic update support to DT/PDT. This new interface is indeed very nice!

As you know I am concerned about some of the implications of connecting infinite loops to the virtual exit (see below). But assuming this is the way to go, the structure of this patch looks OK. I just found a couple of minor typos and would appreciate some more documentation.

Specifically, while there is precedence for extending the post-dominance tree to reverse unreachable nodes the way you propose, this is not well documented in the literature. The paper you based the dominator tree construction on ("An Experimental Study of Dynamic Dominators") seems to very clearly distinguish between unreachable and reachable CFG nodes. Whereas in the new code there won't be any unreachable nodes any more. I think it would be nice to document these differences, the reason why this choice has been taken, and the impact this has on guarantees the PDT interface gives to the user.

The changes this patch introduces to https://reviews.llvm.org/D36107 will already add some documentation.

Daniel Berlin wrote some motivation in his old commit and we afterwards did quite some research that brought up both arguments for and against this change. Especially as the goal is to make the use of the PDT a lot more common, I think it would be great to document the motivation and reasoning both in the commit message and in the patch itself. Could you e.g. briefly list the other compilers that take this approach (I already looked them up for you [https://docs.google.com/document/d/1lL8xcqfnqNArvQj3TpgEXOTybr8kucagjzqvChQCUGI/edit]), clarify that this is indeed not a common extension in the literature but that it is in practice desirable for some reason (which?). I also would really appreciate a brief comment in situations where you had to adapt the original dynamic dominators algorithm to support this use case. This will certainly help others who might work on PDT in the future.

I also have two questions:

1) Does connecting edges to the virtual exit break the parent property?
=======================================================================

You define the parent property as: "If a node gets disconnected from the graph, then all of the nodes it dominated previously will now become unreachable."  After this patch, nodes that become unreachable are still connected to the graph. This is not visible in verifyParentProperty, as it does not update Roots properly, but my understanding is that this is indeed the case and a breakage you are willing to accept to ensure that all nodes are part of the CFG. Is this correct?

2) Can removing an edge from a dominator tree weaken the dominance relation?
============================================================================

The example I am thinking of is the one in https://reviews.llvm.org/D36107.

Assume we start with a normal (complete) CFG:

F3887544: Post Dominance.png <https://reviews.llvm.org/F3887544>

and then drop the edge C -> B.

Today + Literature (leave C unreachable)
----------------------------------------

The output we get today (and which would be computed by "An Experimental Study of Dynamic Dominators") would be the following graph with C  as now being reverse unreachable not part of the graph:

F3888676: Post Dominance (2).png <https://reviews.llvm.org/F3888676>

The relation D postdom B is unaffected. This is to my understanding in line with the definition of post-dominance.

Proposed Patch + GCC (Connecting C to virtual exit)
---------------------------------------------------

This patch will now instead create the following PDT (grey edges have been removed compared to original PDT and are not part of new PDT)

F3889049: Post Dominance (1).png <https://reviews.llvm.org/F3889049>

Interestingly, the property D postdom B does not hold any more. To my understanding, in a complete (post) dominator tree, removing an edge should never weaken a normal dominatore tree. Is this true? Do we loose this property by supporting reverse-unreachable CFG nodes? (I am aware there are certain trade-offs we need to take, but I would like to understand and at best document them)

I believe the issues above are worth documenting.

While thinking about this problem, I experimented with connecting unreachables to the virtual exit, but dropping their connection to the reachable part of the CFG when constructing the Post-Dominator tree (https://reviews.llvm.org/D36135). Similarly to the proposed patch, we change the definition of what post-dominance means to include unreachable nodes, but as the edges we ignore can anyhow never be reached on the reverse CFG (they only became reachable through our virtual edges) this seems like a reasonable extension.

The resulting  tree matches the dominator tree of the reachable part, but also contains all unreachable nodes:

F3888465: Post Dominance (3).png <https://reviews.llvm.org/F3888465>

If you compare the changed test cases, a lot of the regressions we see with https://reviews.llvm.org/D35851 disappear while still all nodes are part of the PDT. I managed to port all existing code to this new approach, but maybe I missed something obvious? I wonder if we have some actual test cases that would break with the above implementation? Is there any transformation pass which would become incorrect with this PDT?

================
Comment at: include/llvm/Support/GenericDomTreeConstruction.h:267
+  // node that serves as a single exit from the function. All the other exits
+  // (CFG nodes with termninators and nodes in infinite loops are logically
+  // connected to this virtual CFG exit node).
----------------
terminators

================
Comment at: include/llvm/Support/GenericDomTreeConstruction.h:1012
+  // Check if the tree has correct roots. A DominatorTree always has a single
+  // root which is the function's entry node. A PostdDominatorTree can have
+  // multiple roots - one for each node with no successors and for infinite
----------------
PostDominatorTree

================
Comment at: include/llvm/Support/GenericDomTreeConstruction.h:1202

+      DEBUG(dbgs() << "Verifying parrent property of node "
+                   << BlockNamePrinter(TN) << "\n");
----------------
parent 

================
Comment at: test/Transforms/StructurizeCFG/no-branch-to-entry.ll:1
+; XFAIL: *
 ; RUN: opt -S -o - -structurizecfg -verify-dom-info < %s | FileCheck %s
----------------
dberlin wrote:
> grosser wrote:
> > Why was this XFAILed? Could you potentially add a comment why this test cases does not work any more?
> This is probably copied from my original patch.
> 
> It used to generate a region that caused it to try to replace the entry block, but it does not anymore.
> I had left it xfailed and pinged the structurizer maintainer to see if he could generate a case where it still tried to replace the entry block.
> I believe it to not be possible anymore, but i couldn't be sure.
> 
> So the short version is: "This test will probably end up removed, but i wanted to give it a little time"
Interesting. @kuhar: It probably makes sense to add a comment to the test cases, that it intended to be removed.

https://reviews.llvm.org/D35851