[PATCH] Inverse post-order traversal for LiveVariables analysis, to recover the performance after r214064

Thu Aug 14 08:35:32 PDT 2014

Looks great to me.

> On Aug 14, 2014, at 3:08 AM, Artyom Skrobov <Artyom.Skrobov at arm.com> wrote:
> 
> Thank you Ted!
> 
> Attaching the updated patch for a final review.
> 
> Summary of changes:
> 
> * Comments updated to reflect the two possible CFG traversal orders
> * PostOrderCFGView::po_iterator taken out of the header file
> * Iteration order for PostOrderCFGView changed to "reverse inverse
> post-order", the one required for a backward analysis
> * ReversePostOrderCFGView created, with the same iteration order that
> PostOrderCFGView used to have, the one required for a forward analysis
> * The two previous consumers of PostOrderCFGView, ThreadSafetyCommon.h and
> Consumed.cpp, switched to use ReversePostOrderCFGView
> * DataflowWorklistBase renamed to DataflowWorklist, and the two
> specializations named BackwardDataflowWorklist and ForwardDataflowWorklist
> 
> I believe this naming scheme matches the accepted terminology best.
> 
> 
> -----Original Message-----
> From: Ted Kremenek [mailto:kremenek at apple.com] 
> Sent: 13 August 2014 21:59
> To: Artyom Skrobov
> Cc: Alexander Kornienko; cfe-commits at cs.uiuc.edu
> Subject: Re: [PATCH] Inverse post-order traversal for LiveVariables
> analysis, to recover the performance after r214064
> 
> 
>> On Aug 13, 2014, at 11:24 AM, Artyom Skrobov <Artyom.Skrobov at arm.com>
> wrote:
>> 
>>>>>> You're right, it's not; but equally the definition of
>>>>>> PostOrderCFGView::po_iterator isn't used anywhere except the
>>>>>> constructor for PostOrderCFGView: it's declared privately,
>>>>>> and it's not used by any PostOrderCFGView consumers.
>>>>> 
>>>>> Ok, that makes sense now, but it makes me feel the APIs themselves
>>>>> are a bit misleading/confusing.  (see below)
>>>> 
>>>> Would it help make things a bit clearer if we move this private
>>>> typedef out of PostOrderCFGView declaration, into PostOrderCFGView
>>>> constructor body? (And the same for the new PostOrderInverseCFGView)
>>> 
>>> It's actually a lot clearer to me now.  But there may be some
>>> benefits in making this implementation detail truly private,
>>> especially if it is used in only one place.  That would remove
>>> some stuff from the header file, cleaning up the interface.
>> 
>> Good, I'll take this implementation detail out of the header file.
>> 
>> 
>>>>> I see it now.  From an API perspective it is very confusing,
>>>>> because the name "PostOrderCFGView" implies the iteration
>>>>> always is in that order.
>>>> 
>>>> Well, that's true: PostOrderCFGView always does it in the
>>>> "normal" order, and PostOrderInverseCFGView always does it
>>>> in the inverse order.
>>> 
>>> The standard terminology is "forward" and "backwards" analysis.
>>> "Normal" implies a bias, when there really isn't one.
>>> 
>>> The names are also not quite right.  A "forward" analysis uses
>>> a "reverse postorder", while a "backwards" analysis uses a
>>> "postorder" traversal.  Thus the names really should be:
>>> 
>>> PostOrderCFGView (for backwards analysis)
>>> ReversePostOrderCFGView (for forwards analysis)
>> 
>> Yes, I see.
>> 
>> 
>>> But I'm wondering about the refactoring here in your patch:
>>> 
>>> [...]
>>> 
>>> We're getting specialized behavior via subclassing, but I'm
>>> wondering if we can make this simpler.  What if we just have
>>> a "DataflowWorklist" class with one constructor that takes
>>> an enum indicating a forward or backward analysis.  Based on
>>> the enum we construct a PostOrderCFGView or a
>>> ReverseOrderCFGView.  Both of those could subclass some common
>>> class, say DataflowCFGView, that provides the iterator.
>>> There is thus no need to duplicate DataflowWorklist.
>> 
>> This is possible, but what do we gain?
>> How is constructing DataflowWorklist(*cfg, AC, DataflowWorklist::backward)
>> simpler than constructing DataflowBackwardWorklist(*cfg, AC)?
>> If I understand you correctly, the syntax for construction is the only
>> difference between the two approaches.
> 
> Those are fair arguments.  I guess I'm thinking this approach would be
> better because all of the specialization of behavior is with the CFG-block
> ordering.  The subclasses of DataflowWorklistBase just replicate that
> specialization, but they don't really add anything.  I'm fine with taking
> this approach for now; if we want to refactor it later we always can.
> 
>> 
>> 
>>> Honestly, we may not need two classes for the CFG view either,
>>> especially if all we are doing is putting the order in the
>>> Blocks vector.  That seems like we just need a "CFGView" class
>>> that initializes the Blocks vector based on the order
>>> specified when the CFGView object is constructed.
>> 
>> PostOrderCFGView is implemented as a ManagedAnalysis, so it has to have a
>> parameterless static method create(), to be run lazily and cached.
>> I may be missing something, but I see no way to specify the iteration
> order
>> when constructing CFGView, to create two independent instances of it, and
> to
>> cache them with respect to their iteration order -- unless the two are
>> implemented as separate classes.
> 
> Good point.
> 
> We could hypothetically extend mechanism there with more overloads to create
> ManagedAnalysis objects with parameters, but that seems like overkill just
> for this.  I agree we should go with the approach you propose.
> 
> <ReversePostOrderCFGView.patch>