[llvm] r296535 - Fix PR 24415 (at least), by making our post-dominator tree behavior sane.

Mon Mar 13 11:58:20 PDT 2017

Ugh, hit send early.
Dammit gmail.

On Mon, Mar 13, 2017 at 11:38 AM, Daniel Berlin <dberlin at dberlin.org> wrote:

>
>
> On Mon, Mar 13, 2017 at 7:51 AM, Tobias Grosser <
> tobias.grosser at inf.ethz.ch> wrote:
>
>> Hi Daniel,
>>
>> thank you again for your patience!
>>
>> In summary, I agree that we want to have a complete post-dominance tree,
>> but I am still concerned about the "flattened" dominance relation. I
>> spent some time to understand if we can get a full post-dominance tree
>> without loosing precision and feel as if we can. Now, I am happy to
>> learn that I am wrong. In fact, at the very least, this discussion
>> collected a lot of background material which will make great source-code
>> comments.
>>
>> Let me start with a concrete question:
>>
>> # Does connecting the virtual edge to the last connecting
>> reverse-reachable basic block break any test case?
>>
>
> We don't currently use post-dom for real, because it is pretty broken.
> I can make it break ADCE if you like.
>
>
>>
>> In https://reviews.llvm.org/D30890, I slightly modified your patch
>> (might need some interface cleanups and another round of testing) to
>> connect infinite loops to the BB that attaches them to the CFG. The
>> modified patch passes alle existing tests only requiring the obvious
>> adjustments, works the way I expect on RegionInfo, and similarly to your
>> patch removes the ADCE hack. So at least within the test coverage we
>> have this solution does not uncover any problems. It also fixes two
>> bound-check elimination problems we had since a while in Julia and
>> most-importantly preserves the post-dominance relationship at the
>> connection points of reverse-unreachable code. Is there a specific
>> example LLVM-IR transformation where we lack test coverage, but which
>> would break today?
>>
>
> No, because we basically don't rely on post-dom info since ... as stated,
> it doesn't' work right :)
> Certainly you can see that the ADCE code we have now is actually a broken
> hack-around for the fact that the path invariant is not maintained.
>
>
>> # Theoretical Details
>>
>> LLVM currently does not include reverse-unreachable code in the
>> post-dominator tree.
>
>

Just to note, this is not correct.
LLVM essentially includes a random set of reverse-unreachable code in the
post-dom tree, and at least all noreturn functions.

It starts pathing from any node with no successors, so you can imagine an
infinite loop where at least one block goes off into a noreturn function or
unreachable, it will then include that infinite loop as well.

> I understand that there is a desire to ensure all
>> basic blocks are part of the post-dominator tree and I agree that we
>> want to make this happen!
>>
>> You aim for a single post-dominator tree (some kind of forest could be
>> an alternative). A sufficient solution to ensure a single post-dominator
>> tree is the augmentation of the CFG with virtual edges such that each
>> basic block is reverse reachable from a (virtual) exit node. I again
>> agree that we want to have a single post-dom tree and that this goal is
>> best reached by
>> augmenting the CFG. It is not yet clear to me why inserting connections
>> directly to the exiting basic block is the only valid choice.
>
>
> Because, IMHO, maintaining the path invariant for unreachable nodes as
> well is the only valid choice,and  as mentioned, optimizations we wish to
> implement depend on it.
>

>
> To
>> establish reverse reachability, it is AFAIU sufficient to insert edges
>> to any other node that is already reverse reachable. The (virtual) exit
>> node is (trivially) reverse reachable, this is true. However, AFAIU
>> inserting an edge to any other reverse reachable node would be
>> sufficient to ensure a complete tree. I assume you agree here.
>>
>
> One of the invariant of the dominator and post-dominator tree is that if x
> is a descendant of y in the post-dom tree, there is a path from y to x in
> the CFG.
>
> I do not see how you guarantee this invariant without connecting the
> distinct reverse-unreachable paths directly to exit.
>

IE i do not think it is a valid choice to say "we only maintain this
invariant for reachable nodes".

>
>
>>
>> Now you state that the only tree that makes sense to form is the one
>> where reverse-unreachable blocks are connected to the exit block,
>> because such a tree:
>>
>> 1) is more canonical
>>
>>    o "I strongly believe we would define it the same way everyone else
>>    does, which is to connect the backedge of the infinite loop
>>     to a virtual exit node. That is the canonical answer you will find
>>     everywhere, and there are good reasons for this."
>>    o "what we do now, post-patch is the canonical thing to do, taught in
>>    compiler textbooks, and used by every other compiler."
>>
>> 2) is the only "correct" solution
>>
>>    o "Did you consider connecting it somewhere else?", "Yes, because it
>>    is IMHO not valid from a correctness perspective, and this will cause
>>    bugs."
>>
>> 1) To understand what is canonical I went a little bit through the
>> literature (and the source of other compilers):
>>
>> https://docs.google.com/document/d/1lL8xcqfnqNArvQj3TpgEXOTy
>> br8kucagjzqvChQCUGI/edit
>
>
> Thgank you for doing this!
>
>
>>
>>
>> You are indeed right that several compilers (including gcc), but not
>> all, connect infinite loops to a canonical exit node. On the other side,
>> literature-wise it was hard to find this mentioned.
>
>
> Literature is often .... ignoring of the practical realities of compilers
> :)
>
>
>
>> Even though
>> post-dominators are known since Lowry 1969, the definitions I can find
>> always expect a CFG where all bbs are reverse-reachable.
>
>
> Correct.
>
>
>> This also holds
>> the the arxiv paper you cited,
>> which states: "Let G = (V, E, s) be a flow graph, and let T be a tree
>> rooted at s, with vertex set consisting of the vertices that are
>> reachable from s.". The only compiler textbook I could come up with that
>> discusses non reverse reachable infinite loops is  "High performance
>> compilers for parallel computing, Michael Wolfe" who describes precisely
>> our problem. He suggests as solutions either to create independent trees
>> or to introduce a phantom edge going e.g. to the exit node.
>>
>

So, instead of extending the same invariants we give reachable nodes, to
unreachable nodes, you want to do something different.
and again, i'm against that because it breaks stuff and doesn't give you
post-dom trees that meet the invariants of post-dom trees when extended to
unreachable nodes.  You can't just argue "well, those invariants are only
guaranteed for reachable nodes", because the whole point of this exercise
is to extend it to unreachable nodes :P

>
>> 2) Regarding correctness: The original definition of post-dominance
>> (Allen 1970) states "A node b_1 is said to forward dominate or post
>> dominate a node b k if b i is on every path from b,. to all exit nodes."
>>
>> Hence, for this very simple example:
>>
>>             start
>>                |
>>             first
>>              /   \
>> /\        /        bb
>> ^  \    /           \
>> ^    loop        exit
>>  \__/
>>
>> all paths from first to exit go through bb. Consequently bb
>> post-dominates exit. In the original papers non-reverse unreachable
>> nodes have not been considered, so they do not give us any insights how
>> such CFG parts should be handled.
>
>
>> Hence, one needs to define an extension that at best preserves this
>> original property.
>
>
> But you've missed an invariant this property guarantees, which is the path
> property, and just said "all i care about is the precision of the reachable
> nodes".
>
> I care more about the invariants, and optimizations care more about the
> invariants, than the precision of the reachable nodes.
>
>
> As discussed earlier we can add additional edges that
>> (transitively) connect loop to a virtual exit. Your proposal is to
>> connect loop -> exit:
>>
>>             start
>>                |
>>             first
>>              /   \
>> /\        /      bb
>> ^  \    /           \
>> ^    loop  ->   exit
>>  \__/
>>
>> Which means that we loose the original dominance property "bb
>> post-dominates first".
>
> I would not claim this is incorrect, but at least
>> imprecise.
>>
> I disagree, because it's the only way i see to maintain the path invariant.
>
>
>>                   start
>>                      |
>>     --------> first
>>   /                  /   \
>> ^     /\        /      bb
>> ^    ^  \    /           \
>> ^    ^    loop      exit
>> \     \__/|
>>   \         /
>>      -----
>>
>> Alternatively, we can connect loop to first, which preserves the
>> property that bb post-dominates first.
>>
>
> And breaks the path invariant in a number of cases.
>
> Example:
>

                  start
                     |
    --------> first
  /                  /   \
^     /\        /      bb
^    A  \    /           \
^    ^    loop      exit
\     \__/|
  \         /
    -----

For example, this claims that bb post-dominates A, loop, and first.

But there is no path from bb to any of those nodes.

The connection of loop to to exit properly gives that answer.
Your suggestion does not.
Again, you argue you don't care about the path invariant, because the
literature only applies it to reachable nodes.
So you suggest we simply ignore it for unreachable nodes.

That, to me, makes no sense.
If they are going to be part of the post-dom tree, they should satisfy the
invariants of the post-dom tree.

Anything else makes little sense to me ;)

>
>
>
>>
>> You explain that the latter solution is incorrect because "In order for
>> x to be a parent of y in the Dom/pdom tree, there must be a path (in the
>> walk direction) from x to y". This statement requires all blocks to be
>> reachable from the source. Without any augmenting edges, we cannot give
>> any statements about dominance relations covering "loop". When
>> artificial edges have been added, this property is true again.
>>
>> Yes.
>
>
>> Now, it seems that you have optimizations in mind which require such a
>> path to actually be executed.
>
>
>
No. It requires that the invariant be maintained to place things correctly,
we don't care if the paths are executed :)

>
>
>> I believe this is beyond what
>> post-dominance can actually guarantee.
>
>
>
> I disagree, and pointed you to specific literature that says otherwise :)
> You again, argue
>

You again, argue that being precise is more important than extending the
same invariants we get for reachable nodes, to unreachable nodes.

I think extending the same invariants is more important than any precision
we get.

>
>> Maybe what you are talking about
>> is strong post dominance?
>>
> nope.
>
>
>>
>> “Node u is strongly post-dominated by v, [...] iff (1) u
>> [post-dominates] v and (2) there is some integer k ≥ 1 s.t. every u−path
>> of length larger than or equal to k passes through v. Node v is a proper
>> strong post-dominator of u iff [u strongly post-dominates v] and u [not
>> equa]l v.”
>> (“Parametric and Termination-Sensitive Control Dependence”, SAS 2006,
>> http://fsl.cs.illinois.edu/pubs/chen-rosu-2006-sas.pdf,
>>   "A Formal Model of Program Dependences and Its Implications for
>>   Software Testing, Debugging, and Maintenance, Andy Podgurski, Lori A.
>>   Clarke, 1990")
>>
>> Most of the text above is just to give some background to others. I do
>> not expect you to answer in such detail. The only question I have if you
>> can give a specific example that would break with the alternative
>> solution I propose and which does not require strong post-dominance
>> (which is likely hard to guarantee)?
>>
> Yes.
ADCE breaks *right* now without the hack it has to try to work around this
(which is not correct in all cases, FWIW).
This is directly because of the path property. It also can break in some
cases because it's not sufficient to check just part of the path for the
discontinuity you introduce above.

SSUPRE will break  because it relies on the path property to optimize
stores properly.
In fact, i'm pretty much not aware of a single optimization relying on
post-dom info that will not break. I'm also positive i can't work around it.

To me, "precision" is not a good reason to have:
1. one set of path invariants for reachable nodes
2. a different one for unreachable nodes

Particularly when that set of invariants is precisely the set of invariants
the dominator/post-dominator tree are meant to guarantee.

As I said, if your specific analysis cares about the precision of the
post-dom tree for reachable nodes, without caring about unreachabkle node
i'm happy to give you that flag.
I just do not, by default, want to make every optimization have to work
around this issue.

I do not wish to have a post-dominator tree where some nodes in it meet the
invariant guarantees of post-dom trees, and some do not.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20170313/c09618d9/attachment.html>