[cfe-dev] checkers: decoupling ValueManager from GRStateManager

Tue Oct 26 09:52:54 PDT 2010

On Oct 26, 2010, at 6:28 AM, Olaf Krzikalla wrote:

> Hi Ted,
> 
> > I think a good first step might be to better define the roles of
> > ValueManager and SValuator.  ValueManager should be used for
> > basic value construction and management, and SValuator for more
> > advanced things.
> OK. So SValuator relies on ValueManager, but (IMHO from here on) actually should not need StoreManager. The construction of even fancy SVals should not depend on the way they are bound to Regions.

SValuators currently rely on StoreManager because pointer arithmetic is handled by StoreManagers.  Thus the dependency between SValuators and StoreManagers.  I don't see this dependency going away, because all details of the memory model need to be handled by the StoreManager.

> Then the StoreManager classes can be build using an particular SValuator
> for the creation of SVals. In an ideal world even SValuator and the StoreManager base class don't need to use each other (as they are orthogonal).

StoreManagers have the flexibility to lazily construct values when queried for bindings.  StoreManagers need access to SValuator to have that kind of reasoning power.  Thus the reverse dependency.  I don't see this going away.

> However, currently all three classes/concepts are somehow cross-linked.

While a bit messy, this is actually intended.  While they implement separate concepts, they are designed to work together to implement the algorithmic components of the path-sensitive engine.  It turns out to implement one that we often need access to the other.  That's just how it is.  We of course should remove dependencies when they are no longer needed.

> Separating ValueManager seems to be the easiest task by moving convertToArrayIndex to SValuator. But you have to expose ArrayIndexWidth and ArrayIndexTy from ValueManager (two get-functions are enough, but I don't like growing interfaces). I'm not sure if these two values are always the same.

After thinking about this some more, I'm wondering if we should just go in the reverse direction.  Instead of splitting ValueManager completing off from SValuator, just integrate all of ValueManager into SValuator.  That way there is one unified place where SVals are constructed.

> If yes, then I could imagine to move these functions to ASTContext.

None of this should ever go into ASTContext.  ASTContext is part of libAST, which is has nothing to do with the analysis component of Clang.

> These two data members seem a little bit odd in ValueManager.
> Once this is done you get the problem that even the StoreManager base class needs to use the SValuator. StoreManager::CastRetrievedVal could be moved to SValuator (it creates a SVal, so it might belong to SValuator), but currently I have no idea how to deal with StoreManager::getLValueElement (which also creates an SVal, but is virtual).

Per my comment above, we inherently can't break StoreManager's dependency on SValuator.

> On the other hand the use of StoreManager should be removed from SValuator. StoreManager::CastRegion (and in turn StoreManager::MakeElementRegion) can become part of SValuator.

I agree that CastRegion can become part of SValuator (and we should do this), but the reasoning of pointer arithmetic needs to stay in the StoreManager.  With this constraint, I don't see how we can break the dependency.

> However, StoreManager::ArrayToPointer looks somewhat strange. Wouldn't it be possible to just move the implementation of RegionStoreManager to SValuator, thus doing always the "right" thing (Basic- and FlatStoreManager implement this function as a no-op).

Perhaps, but BasicStoreManager doesn't reason about arrays, so having it return UnknownVal is actually what we want it to do.  So this is actually by design, and BasicStoreManager::ArrayToPointer() is doing the right thing as far as itself is concerned.

This is how I see the breakdown.  StoreManagers should handle all the management of regions, as regions represent memory and their bindings.  The array to pointer decay is inherently part of how we reason about memory, and currently makes sense to keep in the StoreManager.  Different StoreManagers may wish to represent about this, well, differently, so that is why that method is virtual.

FlatStoreManager is an incomplete StoreManager that is only fractionally implemented, and is currently on hold.  You should honestly ignore it for now.  Where commonality between it and RegionStoreManager exists we can always address with future refactoring.

> Eventually we could end that first step with a layering: ValueManager, SValuator, StoreManager.
> OK, after these random thoughts some more specific comments:

The more I think about it, we should just remove ValueManager entirely.
> 
> 
>> The point of SValuator is to build about symbolic expression values.
> In understand that but I'm not convinced that a virtual interface is necessary. For me SValuator seemed to be an class which encapsulate some TODO things. Are there two different SValuator implementations thinkable (other than a basic vs. an enhanced one)?

Absolutely, and having the virtual interface is extremely important for three reasons:

(1) It allows us to always have a working, stable SValuator, while bringing up another one (which we just slide in).

(2) It allows us to do an apples-to-apples comparison of SValuators in terms of how they perform, their accuracy, etc.

(3) It conceivably allows users to use SValuators that are not part of libChecker.

My goal for SimpleSValuator is to keep it simple.  It has a very simplistic reasoning model that does pretty well, and is good for many analysis tasks.  However, it inherently can't reason about more complicated expressions.  For example:

  a = foo();
  b = bar();
  c = baz();
  return a + b + c;

Suppose 'a', 'b', and 'c' are all assigned symbolic values.  SimpleSValuator does not know how to construct an SVal for 'a + b + c'; it just evaluates it to UnknownVal.  While this could be a TODO for SimpleSValuator, at some point there is a limit to how much reasoning we want it to perform because we want to keep it simple.  Part of the reason for this is that the current ConstraintManagers can't reason about more complicated expressions, so there is no reason to incur more analysis work to get richer SVals for which we can't do anything meaningful with anyway.

In comparison, suppose we created a new SValuator that was built in tandem with a new ConstraintManager that could handle arbitrary linear arithmetic.  These two components (each which implement a virtual interface) could be plugged in together into GRExprEngine.  GRExprEngine would then transparently start building richer SVals, and then have a matching ConstraintManager that (transparently) can reason about them.  This new "analysis engine" would potentially be far slower, however, than the one based on SimpleSValuator.  In the cases where SimpleSValuator does just as well, then it is the clear winner.  Being able to tradeoff between different levels of analysis is the inherent strength of the virtual interface.

> 
>> I'm curious about what pieces you'd like to reuse from ValueManager with other dataflow analyses?
> Short answer: as much as possible. Ideally also SValuator and BasicStoreManager - thats why I want to remove/replace GRState from their interfaces. You might remember of our conversation here several month ago. Back then I ended up with a solution which was sufficient at that time. Now it is time to make things more complete. I have attached my current interface header file. As you can see, ValueFlowCollector could become simpler by just using ValueManager. And the use of SValuator and BasicStoreManager could simplify my implementation further. If I would take the code as it currently is, I would need a GRStateManager (and this would open a can of worms).

I think it might be easier for me to see where this is going if I had an idea of what you *don't* want to use.  The more this discussion proceeds, the more I don't believe that these pieces can be decoupled as much as you might like without putting some serious design restrictions into the path-sensitive engine.

Here's how I see things:

(1) ValueManager can be remove by simply integrating it into SValuator.  That would clean up a bunch of places in the code.

(2) StoreManager depends on SValuator.  This dependency is not going away (at least not anytime soon).

(3) SValuator depends on StoreManager.  This dependency is not going away unless we want to move all reasoning of pointer arithmetic to SValuator, but that's something I strongly don't want to do right now.

(4) The virtual interfaces in SValuator and StoreManager are not going away.

(5) It's not clear we can't remove SValuator's dependency on GRState without losing reasoning power.

I think (1-4) have been addressed earlier in this email, so I'll comment on (5).  Currently GRState* is passed to SValuator because SimpleSValuator uses methods from ConstraintManager (which it can reference via a GRState object).  Most of the methods in ConstraintManager depend on GRState because the GRState object has all the constraints used by the ConstraintManager.  In particular, SimpleSValuator uses ConstraintManager::getSymVal(), which is a virtual method whose result depends on the current GRState.  Thus I'm not really certain this dependency can be severed unless we sever the dependency between SValuator and ConstraintManager.

> 
>>> 4. In which direction is the checkers framework currently developed?
>> 
>> I need a little more context to understand this (fairly open-ended) question.  What do you mean by the "checkers framework?"  Are you referring to the Checker library?
> Yes, I mean the Checker library. Based on your other statements I conclude that separating things is apparently indeed one of the things currently in development. So the more concrete question is: shall I wait just another month or shall I actively work on the decoupling issue?

This kind of refactoring work is gradually done over time as people see a need or they otherwise have interest in doing the cleanup work.  Most activity on libChecker is on other areas right now (e.g., bringing up C++ support), and not refactoring these other pieces.  Nobody else aside from you has expressed interest in reusing these components in a different context, so you will probably need to drive this work.  That said, it's not clear to me where this is heading.  While I very much want the components of the analyzer to be reusable for other purposes, I'm not willing to sacrifice key aspects (many which I have mentioned) of the modular design of the path-sensitive engine.  It's also not clear to me what is the transitive closure of components that you are happy with using, since it seems you are interested in using both SValuator and the StoreManager (which inherently rely on a bunch of other stuff).

Anyhow, let's keep discussing this, and see what we can do.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20101026/2c59bded/attachment.html>