[cfe-dev] Removing "TransferFuncs" from the analyzer

Thu Jul 14 22:50:00 PDT 2011

On Jul 14, 2011, at 8:48 PM, Jordy Rose wrote:

> 
> Oops, right, I meant why the summaries have to be attached to ExplodedNodes -- the only time they're used once attached is for bug reports, and only in those three cases (-dealloc, MakeCollectable in non-GC, and retain/release in GC). Which, actually, is something we could change right now, by only attaching summaries if they have interesting ArgEffects. The actual RetainSummaryManager can still live on RetainReleaseChecker when everything blows over.
> 
> For AnnotatedGRState, I think if we're going to add new infrastructure this isn't a bad way to go. I would think Annotations would be a lot like a GRState's GDM, a wrapper around some kind of tag-keyed map (though it doesn't have to be immutable until it goes into an AnnotatedGRState.
> 
> How do annotations interact with ExplodedNodes being FoldingSetNodes? Are annotations merged, or do they make two different nodes unique? (If they make the nodes unique, does that mean all annotation data—which I'm currently thinking of like a GRState's GDM—has to be profile able?)

I was thinking something very simple.

First, the design is that annotations are sparse.  They rarely happen, so there is no reason to put them in the ExplodedNodes themselves.  I was just thinking of ExplodedGraph of having an side table from ExplodedNodes to (optional) annotations.  Basically, when ExplodedGraph::getNode() is called, it is passed an AnnotatedGRState.  An AnnotatedGRState is just a package for GRState and annotations (that are somehow managed).  Think of AnnotatedGRState as a value object, like SVal.  ExplodedGraph::getNode() then just unpacks the GRState, does what it does now, and if the AnnotatedGRState had an annotation we just plop it in the side table.

That said, you are right about ExplodedNodes being FoldingSetNodes.  Annotations really have more to do with the *transition* between two nodes, so perhaps the annotation should be associated with the edge between the predecessor and the successor node?  That way you can have multiple annotations associated with an ExplodedNode, but those annotations are contextual to the transition.  If we did this, then maybe ExplodedNode:;getNode() wouldn't reason about annotations at all.  Instead, Checker::addTransition() could call generate the new node, and then tell ExplodedGraph to add an annotation between the predecessor node (which CheckerContext knows about) and the new node.  The annotation would then go into a side table.

I actually really like this idea.  Right now CFRefCount (and possibly other checkers) need to walk the ExplodedGraph path to *infer* transition information.  Annotations just record that information for transition.

If we got clever with our representation, and we wanted the option to have annotations be lightweight but possibly prolific, we can possibly optimize how edge sets are represented in ExplodedNode to better store annotations.  I see that as an optimization problem.  I really believe that the majority of nodes, which are not checker generated, would not have annotations.

> 
> Lastly, I'm a little wary of using AnnotatedGRState to carry around a state+annotations, rather than just adding an Annotations parameter to getNode, but I guess it makes things easier for existing code. We'll probably have to look at all the callers of getNode either way if we make this change.

Understood.  My idea for AnnotatedGRState was to provide a way to accrue annotations while manipulating a GRState, but it really is just packaging.  I'm fine with not introducing a class like this for starters, and just working with some basic APIs as you suggest.  There is value in keeping the APIs simple, and building up what we actually need.

> 
> I guess the big question is whether or not this is actually necessary: will checkers need a way to associate information with certain nodes /besides/ RetainReleaseChecker? Probably yes, but there should probably be a guideline to avoid over-using annotations, putting off most of the work until you know there's a diagnostic.

I think this would be useful for a variety of checkers.  The retain release checker would make use of it, and so would a general malloc/free checker, or really any type state checker (including the PthreadLockChecker).  Type state checking is one of the most frequent checking we can do.

Another thing we could possibly provide are "implicit annotations."  For example, suppose we provide an API to query the annotations for an <ExplodedNode, ExplodedNode> pair.  Some of those could be explicitly recorded in a side table, and some of those could be lazily generated from a Checker via a callback.  For example, this is essentially what CFRefCount does: it reverse engineers state transitions when generating diagnostics.  For most cases it doesn't need the function summary (which could be captured by an explicit annotation recorded in the ExplodedGraph), and for those cases the annotation could be generated lazily and on-demand.  Mechanically, much of the logic in the RetainReleaseChecker for generating diagnostics would be the same, but it could be modularized based on a generic, and extensible annotation API for node transitions.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20110714/81bf27e2/attachment.html>