[PATCH] Inverse post-order traversal for LiveVariables analysis, to recover the performance after r214064

Thu Aug 14 03:08:28 PDT 2014

Thank you Ted!

Attaching the updated patch for a final review.

Summary of changes:

* Comments updated to reflect the two possible CFG traversal orders
* PostOrderCFGView::po_iterator taken out of the header file
* Iteration order for PostOrderCFGView changed to "reverse inverse
post-order", the one required for a backward analysis
* ReversePostOrderCFGView created, with the same iteration order that
PostOrderCFGView used to have, the one required for a forward analysis
* The two previous consumers of PostOrderCFGView, ThreadSafetyCommon.h and
Consumed.cpp, switched to use ReversePostOrderCFGView
* DataflowWorklistBase renamed to DataflowWorklist, and the two
specializations named BackwardDataflowWorklist and ForwardDataflowWorklist

I believe this naming scheme matches the accepted terminology best.

-----Original Message-----
From: Ted Kremenek [mailto:kremenek at apple.com] 
Sent: 13 August 2014 21:59
To: Artyom Skrobov
Cc: Alexander Kornienko; cfe-commits at cs.uiuc.edu
Subject: Re: [PATCH] Inverse post-order traversal for LiveVariables
analysis, to recover the performance after r214064

> On Aug 13, 2014, at 11:24 AM, Artyom Skrobov <Artyom.Skrobov at arm.com>
wrote:
> 
>>>>> You're right, it's not; but equally the definition of
>>>>> PostOrderCFGView::po_iterator isn't used anywhere except the
>>>>> constructor for PostOrderCFGView: it's declared privately,
>>>>> and it's not used by any PostOrderCFGView consumers.
>>>> 
>>>> Ok, that makes sense now, but it makes me feel the APIs themselves
>>>> are a bit misleading/confusing.  (see below)
>>> 
>>> Would it help make things a bit clearer if we move this private
>>> typedef out of PostOrderCFGView declaration, into PostOrderCFGView
>>> constructor body? (And the same for the new PostOrderInverseCFGView)
>> 
>> It's actually a lot clearer to me now.  But there may be some
>> benefits in making this implementation detail truly private,
>> especially if it is used in only one place.  That would remove
>> some stuff from the header file, cleaning up the interface.
> 
> Good, I'll take this implementation detail out of the header file.
> 
> 
>>>> I see it now.  From an API perspective it is very confusing,
>>>> because the name "PostOrderCFGView" implies the iteration
>>>> always is in that order.
>>> 
>>> Well, that's true: PostOrderCFGView always does it in the
>>> "normal" order, and PostOrderInverseCFGView always does it
>>> in the inverse order.
>> 
>> The standard terminology is "forward" and "backwards" analysis.
>> "Normal" implies a bias, when there really isn't one.
>> 
>> The names are also not quite right.  A "forward" analysis uses
>> a "reverse postorder", while a "backwards" analysis uses a
>> "postorder" traversal.  Thus the names really should be:
>> 
>> PostOrderCFGView (for backwards analysis)
>> ReversePostOrderCFGView (for forwards analysis)
> 
> Yes, I see.
> 
> 
>> But I'm wondering about the refactoring here in your patch:
>> 
>> [...]
>> 
>> We're getting specialized behavior via subclassing, but I'm
>> wondering if we can make this simpler.  What if we just have
>> a "DataflowWorklist" class with one constructor that takes
>> an enum indicating a forward or backward analysis.  Based on
>> the enum we construct a PostOrderCFGView or a
>> ReverseOrderCFGView.  Both of those could subclass some common
>> class, say DataflowCFGView, that provides the iterator.
>> There is thus no need to duplicate DataflowWorklist.
> 
> This is possible, but what do we gain?
> How is constructing DataflowWorklist(*cfg, AC, DataflowWorklist::backward)
> simpler than constructing DataflowBackwardWorklist(*cfg, AC)?
> If I understand you correctly, the syntax for construction is the only
> difference between the two approaches.

Those are fair arguments.  I guess I'm thinking this approach would be
better because all of the specialization of behavior is with the CFG-block
ordering.  The subclasses of DataflowWorklistBase just replicate that
specialization, but they don't really add anything.  I'm fine with taking
this approach for now; if we want to refactor it later we always can.

> 
> 
>> Honestly, we may not need two classes for the CFG view either,
>> especially if all we are doing is putting the order in the
>> Blocks vector.  That seems like we just need a "CFGView" class
>> that initializes the Blocks vector based on the order
>> specified when the CFGView object is constructed.
> 
> PostOrderCFGView is implemented as a ManagedAnalysis, so it has to have a
> parameterless static method create(), to be run lazily and cached.
> I may be missing something, but I see no way to specify the iteration
order
> when constructing CFGView, to create two independent instances of it, and
to
> cache them with respect to their iteration order -- unless the two are
> implemented as separate classes.

Good point.

We could hypothetically extend mechanism there with more overloads to create
ManagedAnalysis objects with parameters, but that seems like overkill just
for this.  I agree we should go with the approach you propose.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: ReversePostOrderCFGView.patch
Type: application/octet-stream
Size: 12896 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20140814/2fff6303/attachment.obj>