[PATCH] D36107: [PostDom] document the current handling of infinite loops and unreachables

Wed Aug 2 23:28:47 PDT 2017

Note: There is also a rewritten DSE that uses PostDom as well:
https://reviews.llvm.org/D29624

I will try to produce LLVM IR for you (i'm sadly finding less and less time
to work on LLVM these days, so i'm mostly having other people do it :P).

To give you a more described example  the following completely simple
algorithm (which GCC and a few other compilers use variants of, see
tree-ssa-dse.c) *should* work, but will give wrong answers if you ignore
infinite loops.
again, i've made it not catch every case so i can do it in <50 lines.
Normally, for example, there is more than one stack, instead of all
possible stores on the same stack, which misses a ton of cases due to
irrelevant aliasing, blah blah blah.

Stack StoreStack;  // In realer versions of this algorithm, we have
multiple stacks or other structures. In fact, in this version, there will
only ever be zero or one things on the stack.
void walkPostDomChildren(DomTreeNode *DTN)
{
  BasicBlock *BB = DTN->getBlock();
  if (!BB)
    return;
 // Ensure the that the store at the top of the stack is from something
above us in our tree.  Normally would be done by checking if  dfs numbers
of the current block is contained within the dfs in/out numbers of the top
of stack.   See newgvn.cpp, predicateinfo.cpp, etc for ValueStack, which
does this.

Pop Store Stack until block for instruction at  Stack.top() postdominates
DTN.

  for (auto *I : reverse(BB))
    if (auto *SI = dyn_cast<StoreInst>(I)) {
      // If whatever is on the top of the stack is a must alias and
post-dominates us, we're dead.
      if (!Stack.empty() &&
 AA.isMustAlias(Stack.top()->getPointerOperand(), I->getPointerOperand()) )
         I->eraseFromParent();
      else if (Stack.empty())
        Stack.push(SI)
      else
        // This is really a may-def, so we must clear the stack. In this
stupid version, this is true even if it noaliases top of stack.
        Stack.clear()
       continue;
   }
      if (!(getModRefInfo(I) & MRI_NoModRef))
        // A use of memory, so reset our stack
        Stack.clear();
  for (auto * Child : DTN)
       walkPostDomChildren(DTN)
}

walkPostDomChildren(PDT->getRoot())

Basically, if a store postdominates another earlier store without an
intervening may-use/may-def, the earlier store is dead.  There is nothing
that could possibly see it happen before the later store.

Modulo transcription bugs, this algorithm is provably correct for our
simplified assumptions. It will only eliminate stores that are dead.

Given the following program:

Block 1
store a, 5
if (foo)
{
Block 2:
     call foo(load a)
     goto Block 2
}
Block 3:
store a, 6

If infinite loops are not in the post-dominator tree, the tree will look
like this:

3
|
1

The algorithm will visit 3, then push store a, 6 on the stack, visit 1, and
use it to remove store a, 5
This is illegal.  The store a, 5, is visible and used in block 2, and block
2 in fact, calls a function with value we stored.

Whoops!

If infinite loops are connected to virtual exit, the tree will look like
this:

virtual exit
 /  |  \
3  2 1

We will no longer eliminate store a, 5, and we will get a correct answer
because at each point, nothing will be on the stack.
We will visit 3, push store a, 6. We will visit 2, pop the stack because
store a, 6 is no longer in scope.  Stack will stay empty throughout 2.
We will visit 1, push store a, 5.  We will exit the algorithm with this
store on the stack, having eliminated nothing.
We will get the correct answers no matter how many things no longer exit.
This is a safe and conservative post-dominator tree for our algorithm.

Is this perfect?  For certain, no.

In this example, for *this* algorithm, you can connect the infinite loop
anywhere you like as long as you do not generate the post-dominator tree:
3
|
1
|
2

or
3
|
2

If you generate that, you will eliminate the store a, 5 again, and you
should not.

So, for example:
virtual exit
 |     \
 3     \
 |       1
 2

Would also give a correct answer for this algorithm, on this example.  But
would be generally invalid because there is no reverse path from 3 to 2,
*and* there is a reverse path from 3 to 1, but 3 is not an ancestor of 1.

In fact, if the program was:
Block 1
store a, 5
if (foo)
{
Block 2:
     goto Block 2
}
Block 3:
store a, 6

we will still generate:
virtual exit
 /  |  \
3  2 1

but in practice, we could eliminate store a, 5 as it's not possible to see
that it didn't happen in this example.

The greater problem with not connecting the infinite loops to exit is that
"where it is legal to connect the loops to" (IE what doesn't break this
algorithm) is now complex.  Effectively, it must take into account where
the memory instructions are (though the exact conditions that must be
fulfilled do not have to be expressde like this)
Otherwise, i can generate an adversarial program that will give a wrong
answer when you apply the above algorithm.
IE by placing loads and stores and calls such that you end up with the same
problem as the 3->1->2 post dom tree above.

In particular, for the algorithm above, the places for where you connect
the infinite loops must ensure either the stack gets cleared in the right
place, or that the blocks are siblings instead of parent child, so the
scope gets reset.  (IE either we must always see the may-use in the
infinite loop and clear the stack before going above it, or we must pop the
stack so that we do not try to eliminate something "above" the infinite
loop using something "below" the infinite loop)

Worse than that, that's just what works with *this* algorithm.

SSUPRE, which requires a conservative correct postdominance frontier, has
worse requirements you'd have to prove are met.
So does Sreedhar and Gao's IDF algorithm (which is what IDF calculator
uses), which requires the CFG and PostDom tree be in sync, and postdom have
the parent and sibling property.
(This is solvable by making the real cfg have the set of fake edges you are
creating, but they'd need to be walked when you call predecessors.  If you
connect them all to a fake exit, even if you have algorithms that must know
about the edges, it's one block to special case instead of many)

Additionally, if you use post-dominance to prove speculation-safety, it's
now not enough to know where the load/stores are, you'd have to know where
every trapping instruction is to ensure the place you connect the infinite
loop does not imply speculation safety where it should not.

Essentially: Even if you completely ignore the sibling and parent
properties, connecting the infinite loops anywhere else requires enough
knowledge of the algorithms being used to be able to prove that you aren't
going to generate post-dom trees that make them wrong.
In practice, since at least the IDF calculator relies on the sibling and
parent properties, this ends up a practical requirement for the connection
points...

All that seems ... really fragile and difficult :)

On Wed, Aug 2, 2017 at 9:19 PM, Tobias Grosser <tobias.grosser at inf.ethz.ch>
wrote:

> On Thu, Aug 3, 2017, at 06:12, Tobias Grosser via llvm-commits wrote:
> > On Thu, Aug 3, 2017, at 02:19, Jakub Kuderski via Phabricator via
> > llvm-commits wrote:
> > > kuhar added a comment.
> > >
> > > I've just realized that I didn't notice the TODOs  in this patch when I
> > > did a code review for it.
> > >
> > >   // TODO: Can we change the PDT definition such that C remains part of
> > >   the
> > >   //       CFG, at best without loosing the dominance relation D
> postdom
> > >   B.
> > >
> > > I don't think we reached an agreement on the second part of the TODO,
> as
> > > the `"at best"` might imply. Could you reword the second part of the
> > > comment to be less ambiguous and not to suggest that?
> >
> > I can just drop it.There is no need to sneak in TODOs to reach an
> > agreement.
> > Hopefully our discussion is fruit
> >
> > > And a very minor nitpick: isn't it supposed to be s/loosing/losing? :)
>
> r309919
>
> Best,
> Tobias
> >
> > Best,
> > Tobias
> > _______________________________________________
> > llvm-commits mailing list
> > llvm-commits at lists.llvm.org
> > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20170802/f54d10b6/attachment.html>