[PATCH] Inverse post-order traversal for LiveVariables analysis, to recover the performance after r214064

Ted Kremenek kremenek at apple.com
Fri Aug 8 18:32:33 PDT 2014


Hi Artyom,

I have a question about this code:

>   // It may be that some blocks are inaccessible going from the CFG exit upwards
>   // (e.g. infinite loops); we still need to add them.
>   while (Blocks.size() < size) {
>     // Find a block that we skipped
>     CFG::const_iterator I = cfg->begin(), E = cfg->end();
>     for (; I != E; ++I)
>       if (!BlockOrder.count(*I))
>         break;
>     assert(I != E);
>     const CFGBlock* block = *I;
>     // Add a chain going upwards
>     while (!BlockOrder.count(block)) {
>       BlockOrder[block] = Blocks.size() + 1;
>       Blocks.push_back(block);
>       CFGBlock::const_pred_iterator PI = block->pred_begin(),
>                                     PE = block->pred_end();
>       for (; PI != PE; ++PI) {
>         const CFGBlock* pred = *PI;
>         if (pred && !BlockOrder.count(pred)) {
>           block = pred;
>           break;
>         }
>       }
>       // Chain ends when we couldn't find an unmapped pred
>       if (PI == PE) break;
>     }
>   }

It looks like we will do a complete iteration over the CFG blocks many times: one full scan each time we mark a skipped block.  Can we invert this loop?

    for (CFG::const_iterator I = cfg->begin(), E = cfg->end(); I != E; ++I) {
      if (Blocks.size() == size)
        break;

That seems far more efficient.  We have an invariant that once we visit a block we don't need to visit it again.  Inverting the loop makes this an O(n) operation, instead of possibly O(n^2).

Also minor, but I'd prefer this:

      BlockOrder[block] = Blocks.size() + 1;
      Blocks.push_back(block);

to be written as:

      Blocks.push_back(block);
      BlockOrder[block] = Blocks.size();

That's more consistent (I think) with how we do these things elsewhere.
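
Putting the two suggestions together, here's a rough sketch of the shape I have in mind (illustrative only, not compiled; 'size', 'Blocks' and 'BlockOrder' are as in the quoted patch):

    // Walk the CFG once; each block gets mapped exactly once, so the whole
    // fix-up is O(n) in the number of blocks.
    for (CFG::const_iterator I = cfg->begin(), E = cfg->end(); I != E; ++I) {
      // All blocks have been ordered; nothing left to recover.
      if (Blocks.size() == size)
        break;
      const CFGBlock *block = *I;
      // Add a chain going upwards from this skipped block.
      while (block && !BlockOrder.count(block)) {
        Blocks.push_back(block);
        BlockOrder[block] = Blocks.size();
        // Continue the chain through the first unmapped predecessor, if any.
        const CFGBlock *next = nullptr;
        for (CFGBlock::const_pred_iterator PI = block->pred_begin(),
                                           PE = block->pred_end();
             PI != PE; ++PI) {
          const CFGBlock *pred = *PI;
          if (pred && !BlockOrder.count(pred)) {
            next = pred;
            break;
          }
        }
        block = next;
      }
    }

The early 'break' relies on the invariant above: once a block is mapped, we never need to touch it again.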

Note that even this iteration of skipped blocks isn't completely precise; we may hit a block first that dominates other blocks that are all dead.

Perhaps a better approach is to include all unreachable blocks by querying the CFG for unreachable edges.  We now try to build the CFG so that unreachable edges aren't hard-pruned; instead, the pruned edge stays accessible via an API call:

    /// Get the potentially unreachable block.
    CFGBlock *getPossiblyUnreachableBlock() const {
      return UnreachableBlock.getPointer();
    }

That said, we may not have converted the entire CFG to record these "ghost edges".  However, if you include the possibly unreachable blocks when you do your initial construction of Blocks, there will be fewer blocks skipped in the first pass, and everything will be in the proper order in the worklist.
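
For instance, wherever the traversal walks predecessor edges, the pruned side could be consulted too.  In the fix-up chain above, that might look like this (a sketch, assuming pred_iterator dereferences to a CFGBlock::AdjacentBlock; not compiled):

    CFGBlock::const_pred_iterator PI = block->pred_begin(),
                                  PE = block->pred_end();
    for (; PI != PE; ++PI) {
      // The reachable side of the edge; null if the edge was pruned.
      const CFGBlock *pred = *PI;
      if (!pred)
        // Recover the other side of the pruned edge.
        pred = PI->getPossiblyUnreachableBlock();
      if (pred && !BlockOrder.count(pred)) {
        block = pred;
        break;
      }
    }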

Also minor, but please make all comments full sentences that end with a '.'.

For expediency, aside from the comments being slightly tweaked, I'm fine with the patch going in as-is.  That would hopefully unblock Alex's current performance problem.  I do think we should invert the above loop in a subsequent patch, though.

Ted


> On Aug 8, 2014, at 8:48 AM, Artyom Skrobov <Artyom.Skrobov at arm.com> wrote:
> 
> Thank you for your patience; indeed it took me some time to prepare the proper patch, which is now ready for review.
>  
> It comes without a regression test, as the only observable change is a performance improvement in the analysis.
>  
>  
> From: Alexander Kornienko [mailto:alexfh at google.com]
> Sent: 07 August 2014 23:35
> To: Artyom Skrobov
> Cc: cfe-commits at cs.uiuc.edu Commits
> Subject: Re: r214064 - Factoring DataflowWorklist out of LiveVariables and UninitializedValues analyses
>  
> Any news here? This keeps biting us.
>  
> Maybe you can apply this patch first and work on a better solution after that?
> 
> On Tue, Aug 5, 2014 at 6:23 PM, Artyom Skrobov <Artyom.Skrobov at arm.com> wrote:
> So, the simplest way to restore the original behaviour seems to be
>  
> Index: lib/Analysis/LiveVariables.cpp
> ===================================================================
> --- lib/Analysis/LiveVariables.cpp   (revision 214183)
> +++ lib/Analysis/LiveVariables.cpp   (working copy)
> @@ -451,6 +451,7 @@
>    // Construct the dataflow worklist.  Enqueue the exit block as the
>    // start of the analysis.
>    DataflowWorklist worklist(*cfg, AC);
> +  worklist.quick_hack();
>    llvm::BitVector everAnalyzedBlock(cfg->getNumBlockIDs());
>    // FIXME: we should enqueue using post order.
> Index: include/clang/Analysis/Analyses/DataflowWorklist.h
> ===================================================================
> --- include/clang/Analysis/Analyses/DataflowWorklist.h     (revision 214183)
> +++ include/clang/Analysis/Analyses/DataflowWorklist.h     (working copy)
> @@ -53,6 +53,7 @@
>    const CFGBlock *dequeue();
>    void sortWorklist();
> +  void quick_hack() { enqueuedBlocks.reset(); PO_I = PO_E; }
> };
>  } // end clang namespace
>  
> This has the effect of disabling the fallback post-order iteration.
>  
> The reason for the performance regression, however, is related to the iteration order over the CFG: UninitializedValues expects the default (i.e. reverse) PostOrderCFGView iteration order, whereas LiveVariables expects the opposite order, corresponding to using llvm::GraphTraits<llvm::Inverse<const CFG*> > as the last template parameter of the po_iterator typedef.
>  
> I’m currently thinking of a proper solution which would allow using PostOrderCFGView for iteration in either direction.
>  
>  
>  
> From: Alexander Kornienko [mailto:alexfh at google.com]
> Sent: 05 August 2014 12:30
> 
> To: Artyom Skrobov
> Cc: cfe-commits at cs.uiuc.edu Commits
> Subject: Re: r214064 - Factoring DataflowWorklist out of LiveVariables and UninitializedValues analyses
>  
> On another file from the same project, the difference in my setup is 2 minutes before the patch vs. 50+ hours after it (I terminated the analyzer before it finished). So it's a rather bad performance regression.
>  
> 
> <DataflowWorklistInverse.patch>