[PATCH] D35851: [Dominators] Include infinite loops in PostDominatorTree

Tobias Grosser via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Aug 2 14:05:42 PDT 2017


grosser added a comment.

In https://reviews.llvm.org/D35851#829611, @dberlin wrote:

>


Hi Daniel,

Thank you for adding some more context! Let me fire off a quick reply with some questions before looking deeper into your points!

>> While I currently do not have any benchmark where optimizing infinite loops is useful (is there one I could have a look at?), let's work under the assumption that it actually is. Chandler and Daniel raised the point that just the fact that the PDT contains all nodes makes user code simpler, which is another argument for this kind of modeling. In general, both points together make a good motivation to include all nodes in the PDT. I agree, assuming we understand and are happy with the resulting implications.
> 
> FWIW, here is some more literature that assumes you have a single entry/exit node in postdom.
>  http://ssabook.gforge.inria.fr/latest/book.pdf

Fabrice's book, very cool indeed!

> The PDG building algorithm (which is the standard one, AFAIK, but you are probably more familiar with the PDG than I am): ".. Similar to the CFG, a PDG also has two nodes ENTRY and EXIT,
>  through which control flow enters and exits the program respectively. For this
>  purpose, it is assumed that every node of the CFG is reachable from the entry
>  node and can reach the exit node."
> 
> Most *optimizations* that use post-dominators have this assumption hidden in them, stated or not.
> 
> If you stare at 12.5.3 in that book, you will find a simplified explanation of SSUPRE.
> 
> If you don't include infinite loops and connect them to exit, the sink points will be wrong.  We have an impl of SSUPRE posted for review; I think you can actually get it to happen for real with it (but I haven't had time to play).

Could you point me to the Phabricator link? A quick search for SSUPRE did not bring it up.

> I.e., given a trivial variation on their program, to try to make it easy to follow:
> 
>   regular loop {
>     store a
>     if (foo) {
>       infinite loop:
>         load a, do something with it
>       exit program or go to infinite loop
>     } 
>   }
>   load a
>   
> 
> 
> The goal of this program is that the blocks of the infinite loop currently do not appear in the postdominator tree or are not connected to exit (hopefully it is trivial to see that it is possible to accomplish this given the current root finding, but again, I have not run it through clang).
>  Without infinite loops included or connected to exit, it will generate:
> 
>   regular loop {
>     if (foo) {
>       infinite loop:
>         load a, do something with it
>       exit program or go to infinite loop
>     } 
>   }
>   store a
>   load a
>   
> 
> 
> (Note that this load is deliberately here to cause it to store back.  If it wasn't here, and instead you just had "use <value stored>", it would eliminate the store in favor of a temporary, registerizing the memory access)
> 
> This is, however, wrong, as now the load in the infinite loop does not see the store.
>  I.e., it will never consider the load in the infinite loop as a use that it must consider when placing factoring points (as it will not appear in the postdominance frontier). It will see this as a full redundancy.
> 
> With infinite loops present and connected, it will do this:
> 
>   regular loop {
>     if (foo) {
>       infinite loop:
>         store a, value
>         load a, do something with it
>       exit program or go to infinite loop
>     } 
>   }
>   store a, value
>   load a
>   

I need to think about these. Some LLVM-IR unit tests that exhibit the property you are looking for would make the discussion here more concrete, as I will need some time to translate your textual ideas into IR. Even without them, I will certainly have a look at your pointers.
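As a starting point, here is my rough attempt at translating your first example into IR. This is only a sketch with made-up names (@sink_example, the branch conditions, etc.); I am guessing that "exit program" is a noreturn call, and I have not yet checked what the current root finding actually computes for this shape:

  ; The "infinite loop" region can only leave the function through a noreturn
  ; call, so none of its blocks reach the normal exit.
  declare void @exit(i32) noreturn

  define void @sink_example(i1 %foo, i1 %done, i1 %again, i32* %a) {
  entry:
    br label %loop

  loop:                                     ; regular loop
    store i32 1, i32* %a                    ; store a
    br i1 %foo, label %inf, label %latch

  inf:                                      ; infinite loop
    %v = load i32, i32* %a                  ; load a, do something with it
    br i1 %done, label %die, label %inf     ; exit program or go to infinite loop

  die:
    call void @exit(i32 0)
    unreachable

  latch:
    br i1 %again, label %loop, label %after

  after:
    %w = load i32, i32* %a                  ; load a
    ret void
  }

Dumping the tree for this (something like 'opt -postdomtree -analyze', if I remember the flags correctly) should show whether %inf and %die end up connected to the virtual exit. A variant where %inf just branches back to itself (without the @exit call) might be the cleaner way to keep the loop out of the current tree.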

>> The same question as above: what prevents you to hoist these very instructions in the original program with a dynamically infinite loop?
> 
> post-dominance alone is definitely not sufficient to guarantee safety in general for either hoisting or sinking.

I fully agree that post-dominance alone is not sufficient. This is contrary to the impression @kuhar's reply left me with. I need to think more about your answer below, but a more direct answer to the questions I asked will certainly help to clarify some of what has been discussed before. @kuhar, can you help me understand whether some of the previous examples make good test cases for the hoisting transformation you have in mind?

Overall, I am really not trying to shoot down this proposal. Even though I would like to have a PDT that works well with regions, if we can together come up with good test cases that show this is not possible, I will be the first to commit and document them! Please help me get your requirements written down as nice, understandable unit tests! If our discussion ends up producing useful test cases and documentation, then it was certainly of use.

> It's often a necessary but not sufficient condition for hoisting/sinking, and is usually used to ensure non-speculation of memory along new paths.
>  In particular, "does this load post-dominate all other loads that are the same?" is the trivial version of this (if so, you are not speculating it).  LLVM currently hacks around this with some expensive walking/testing that could be replaced with O(1) answers based on post-dominance.  That would be "is the later load in a cdeq set of an earlier load?".  If so, they must execute together or not at all.
> 
> However, it is also used for elimination and elimination-based sinking.
>  As another example: in a function without any loads (to keep it simple), a later store that post-dominates all blocks containing other stores makes all the earlier stores redundant.
> 
> I.e.:
> 
> if (foo)
>  store a
> else
>  store a
> 
> store a
> 
> The later store post-dominating all the other stores is sufficient to eliminate them [1].
> 
> This also fails to be true in certain conditions if you either don't include the loops, or don't connect them to exit.
>  The important part is that the loop appears, and that it does not appear in the "wrong" part of the tree. As a trivial example of why: if you add an infinite loop on a branch that, say, calls a global function that can see the store, you need to ensure we don't delete that store by seeing a wrong relationship.  So the infinite loop can only be connected to blocks that will not cause us to believe that the later store post-dominates all uses.
>  It is certainly possible, depending on the location of stores/loads/etc, to connect the infinite loops somewhere else, and it may not matter for regions, but depending on the locations, it may affect store elimination.

Again, I will need to think about the above and will try to come up with LLVM-IR test cases that illustrate the problem you describe. If you happen to have some available, this would be very much appreciated.
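To have something concrete to start from, here is a rough sketch of how I would translate the store-elimination example with the infinite-loop twist into IR. Again, this is only my guess at the structure (names like @elim_example and @observer are made up), and I have not checked what either the current or the proposed PDT computes for it:

  ; The store in %merge post-dominates the stores in %then and %else, but the
  ; 'then' branch can also enter an infinite loop that calls a function which
  ; may read @a, so the store in %then must not be deleted.
  @a = global i32 0

  declare void @observer()                  ; assumed to possibly read @a

  define void @elim_example(i1 %foo, i1 %spin) {
  entry:
    br i1 %foo, label %then, label %else

  then:
    store i32 1, i32* @a
    br i1 %spin, label %inf, label %merge

  inf:                                      ; infinite loop, never reaches %merge
    call void @observer()                   ; can see the store in %then
    br label %inf

  else:
    store i32 2, i32* @a
    br label %merge

  merge:
    store i32 3, i32* @a                    ; post-dominates the earlier stores
    ret void
  }

If the blocks of the infinite loop are missing from the tree (or attached in the "wrong" place), a pass could conclude that %merge post-dominates %then and treat the first store as redundant, which I believe is exactly the wrong relationship you are warning about.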

Best,
Tobias

> [1] This notion really is the basis of the SSUPRE transformation.
>  If you view the stores and aliased loads as "uses", it's basically "place a store where it postdominates all 'uses'. Eliminate any earlier 'uses' that are stores. Insert a store before any 'use' that is a load".
>  Again, this is not a *perfect* description, because SSUPRE also considers lifetime optimality, and only tries to sink them as far as necessary to remove redundancy, instead of "as far down as it can".




https://reviews.llvm.org/D35851




