[PATCH] Inverse post-order traversal for LiveVariables analysis, to recover the performance after r214064

Sat Aug 9 08:28:44 PDT 2014

Thanks for your comments Ted!

> It looks like we will do a complete iteration of the CFG blocks
> many times, each time we mark a skipped block.  Can we invert
> this loop?
>
>    for (CFG::const_iterator I = cfg->begin(), E = cfg->end(); I != E; ++I) {
>      if (Blocks.size() == size)
>        break;
>
> That seems far more efficient.  We have an invariant that once
> we visit a block we don't need to visit it again.  Inverting
> the loop makes this an O(n) operation, instead of possibly O(n^2).

My original motivation was that in most cases, all blocks would be reachable to the po_iterator, so that "while (Blocks.size() < size)" loop wouldn't be executed at all. That said, now I see how your suggestion allows simplifying this whole loop to

  // It may be that some blocks are inaccessible going from the CFG exit upwards
  // (e.g. infinite loops); we still need to add them.
  for (CFG::const_iterator I = cfg->begin(), E = cfg->end();
       (Blocks.size() < size) && (I != E); ++I) {
    const CFGBlock* block = *I;
    // Add a chain going upwards.
    while (!BlockOrder.count(block)) {
      Blocks.push_back(block);
      BlockOrder[block] = Blocks.size();
      CFGBlock::const_pred_iterator PI = block->pred_begin(),
                                    PE = block->pred_end();
      for (; PI != PE; ++PI) {
        const CFGBlock* pred = *PI;
        if (pred && !BlockOrder.count(pred)) {
          block = pred;
          break;
        }
      }
      // Chain ends when we couldn't find an unmapped pred.
      if (PI == PE) break;
    }
  }

> Also minor, but I'd prefer this:
>
>      BlockOrder[block] = Blocks.size() + 1;
>      Blocks.push_back(block);
>
> to be written as:
>
>      Blocks.push_back(block);
>      BlockOrder[block] = Blocks.size();
>
> That's more consistent (I think) with how we do these things elsewhere.

The first sequence is the same as how it was done in the original PostOrderCFGView constructor; but I don't mind changing the sequence in both places.

> I was thinking about this some more, and I'm wondering if enqueuing
> the unreachable blocks is the wrong approach.  The reality is that
> since these blocks are unreachable we don't even need to bother
> computing their dataflow values.  They will simply never be used.

They are "inverse-unreachable", i.e. the CFG exit is not reachable from them.

An illustrative example is rdar10308201() in test/Analysis/misc-ps.c

If we don't analyze such blocks (e.g. infinite loops), then variables defined outside of an infinite loop, and used inside the infinite loop, would be erroneously reported as dead.

> I'm also wondering about the following code in your patch:
> 
> class PostOrderInverseCFGView : public PostOrderCFGView {
>   typedef llvm::po_iterator<const CFG*, CFGBlockSet, true,
>                             llvm::GraphTraits<llvm::Inverse<const CFG*> >
>                            > po_iterator;
>
> It is true that this inverts the order, but I don't think this
> definition of po_iterator is used anywhere except the constructor
> for PostOrderInverseCFGView.

You're right, it's not; but equally the definition of PostOrderCFGView::po_iterator isn't used anywhere except the constructor for PostOrderCFGView: it's declared privately, and it's not used by any PostOrderCFGView consumers.

> There is no virtual definition of the 'dequeue' method, so
> 'dequeue' (which is declared in DataflowWorklistBase) uses
> the po_iterator defined in DataflowWorklistBase, which is
> the wrong definition for reverse analysis.

No, dequeue() uses PostOrderCFGView::iterator, which iterates over the Blocks vector. So, the only factor affecting the order of iteration is how the Blocks vector is initialized in the PostOrderInverseCFGView constructor.