[cfe-dev] [PATCH] expand a symbolic pointer

Ted Kremenek kremenek at apple.com
Wed Dec 10 17:43:06 PST 2008


On Dec 9, 2008, at 10:58 PM, Zhongxing Xu wrote:

> New patch.
>  - A new method CheckStructBase is taking care of the creation of  
> AnonPointeeRegion.
>  - A lightweight lvalue evaluation method is used to get the  
> location of the base expr
>    whose store may be changed.
>  - AnonPointeeRegion is now identified the the memregion of the  
> pointer pointing to it.

Nice work!  This is definitely going in the right direction!  Comments  
inline:

Index: include/clang/Analysis/PathSensitive/Store.h
===================================================================
--- include/clang/Analysis/PathSensitive/Store.h	(revision 60812)
+++ include/clang/Analysis/PathSensitive/Store.h	(working copy)
@@ -86,6 +86,12 @@
    ///  conversions between arrays and pointers.
    virtual SVal ArrayToPointer(SVal Array) = 0;

+  /// CheckStructBase - StoreManagers can conditionally expand a  
symbolic
+  /// pointer into a pointer to a struct region.
+  virtual Store CheckStructBase(Store store, SVal BaseV, SVal BaseL,  
bool NonNull){
+    return store;
+  }

Please add comments about what this method does.  The comment above  
the method is a little content-free.

Also please document what are the roles of 'BaseV' and 'BaseL'.  As  
you will see in my later comments, there roles aren't really all that  
clear to me, and I wonder if we can make this interface even simpler.

Moreover, why do we need the parameter 'NonNull'?  Why not just have  
GRExprEngine check if that value is false and then not call this  
method in the first place.  In other words, in the case where the  
pointer is null, GRExprEngine knows that the StoreManager isn't going  
to do anything.  This also simplifies the implementation (and  
interface) for StoreManager.

Finally, the name 'CheckStructBase' doesn't really mean anything to me.


Index: include/clang/Analysis/PathSensitive/GRExprEngine.h
===================================================================
--- include/clang/Analysis/PathSensitive/GRExprEngine.h	(revision 60812)
+++ include/clang/Analysis/PathSensitive/GRExprEngine.h	(working copy)
@@ -505,6 +505,9 @@
    /// a DeclRefExpr, it evaluates to the MemRegionVal which  
represents its
    /// storage location. Note that not all kinds of expressions has  
lvalue.
    void VisitLValue(Expr* Ex, NodeTy* Pred, NodeSet& Dst);
+
+  /// getLValue - A lightweight lvalue evaluation method.
+  SVal getLValue(const GRState* St, Expr* Ex);

The purpose of this method is not really clear from the comment.   
Please elaborate.  Moreover, it appears to only have one caller.

Index: include/clang/Analysis/PathSensitive/MemRegion.h
===================================================================
--- include/clang/Analysis/PathSensitive/MemRegion.h	(revision 60812)
+++ include/clang/Analysis/PathSensitive/MemRegion.h	(working copy)
@@ -237,20 +237,20 @@
  ///  parameters or pointer globals. In RegionStoreManager, we assume  
pointer
  ///  parameters or globals point at some anonymous region. Such  
regions are not
  ///  the regions associated with the pointer variables themselves.   
They are
-///  identified with the VarDecl of the pointer variable. We create  
them lazily.
+///  identified with the MemRegion of the pointer. We create them  
lazily.

  class AnonPointeeRegion : public TypedRegion {
    friend class MemRegionManager;
    // VD - the pointer variable that points at this region.
-  const VarDecl* Pointer;
+  const TypedRegion* Pointer;


All the AnonPointeeRegion stuff looks great!  You can probably just  
commit that stuff as a separate patch since it naturally subsumes the  
current logic.


Index: lib/Analysis/RegionStore.cpp
===================================================================
--- lib/Analysis/RegionStore.cpp	(revision 60812)
+++ lib/Analysis/RegionStore.cpp	(working copy)
@@ -101,6 +101,8 @@

    SVal ArrayToPointer(SVal Array);

+  Store CheckStructBase(Store store, SVal BaseV, SVal BaseL, bool  
NonNull);
+

Please put a comment here as well.  Again, what is the purpose of  
'NonNull'.  I don't think it's needed.




+Store RegionStoreManager::CheckStructBase(Store store, SVal BaseV,  
SVal BaseL,
+                                          bool NonNull) {
+  // Expand the symbolic pointer only when it can be NonNull.
+  if (isa<loc::SymbolVal>(BaseV) && NonNull) {
+    // Get the location that points to the struct.
+    loc::MemRegionVal& RV = cast<loc::MemRegionVal>(BaseL);
+
+    // Create an AnonPointeeRegion for it.
+    AnonPointeeRegion* APR =
+      MRMgr.getAnonPointeeRegion(cast<TypedRegion>(RV.getRegion()));
+
+    // Symbolize the struct region.
+    store = BindStructToSymVal(store, APR);
+
+    // Update the base pointer.
+    store = Bind(store, RV, Loc::MakeVal(APR));
+  }

Shouldn't BaseV always be symbolic?  (i.e., an assertion, not an 'if').

NonNull isn't needed.  We can simplify the interface and make it  
assumed.

Does this handle the case where a symbolic pointer to a struct has  
already had a region lazily created for it?

I'm also confused about the difference between BaseL and BaseV.  Don't  
we need just one?  It seems like the basic interface is that we pass  
in a SymbolicRegion.  No SVals at all.  The interface then gets  
simpler and more obvious.

Please document what CheckStructBase is, say its pre- and post-  
conditions (i.e., its interface), etc.


  SVal RegionStoreManager::Retrieve(const GRState* state, Loc L,  
QualType T) {
    assert(!isa<UnknownVal>(L) && "location unknown");
    assert(!isa<UndefinedVal>(L) && "location undefined");
Index: lib/Analysis/GRExprEngine.cpp
===================================================================
--- lib/Analysis/GRExprEngine.cpp	(revision 60812)
+++ lib/Analysis/GRExprEngine.cpp	(working copy)
@@ -425,6 +425,28 @@
    }
  }

+SVal GRExprEngine::getLValue(const GRState* St, Expr* Ex) {
+  switch (Ex->getStmtClass()) {
+  case Stmt::DeclRefExprClass: {
+    const VarDecl* VD = cast<VarDecl>(cast<DeclRefExpr>(Ex)- 
 >getDecl());
+    return StateMgr.GetLValue(St, VD);
+  }
+
+  case Stmt::UnaryOperatorClass: {
+    UnaryOperator* U = cast<UnaryOperator>(Ex);
+    if (U->getOpcode() == UnaryOperator::Deref) {
+      // The lvalue of *x is the rvalue of x, and it should be  
already evaluated.
+      return GetSVal(St, U->getSubExpr());
+    } else
+      return UndefinedVal();
+  }
+
+  default:
+    assert(0 && "Unhandled Stmt class.");
+    return UndefinedVal();
+  }
+}

Please add some comments to 'getLValue'.  Please clarifiy what purpose  
it serves (and for whom?).  Since that name is so clase to  
GRStateManager::getLValue(), its worth having more comments (since it  
does something a little different).  I'm not certain what it's role is  
compared to VisitLValue.

  void GRExprEngine::VisitLValue(Expr* Ex, NodeTy* Pred, NodeSet& Dst) {

    Ex = Ex->IgnoreParens();
@@ -935,6 +957,29 @@
      // FIXME: Should we insert some assumption logic in here to  
determine
      // if "Base" is a valid piece of memory?  Before we put this  
assumption
      // later when using FieldOffset lvals (which we no longer have).

Please add comments saying what this code is doing.

+
+    // Some kinds of struct base are mutable. Get base expr's location.
+    SVal BaseL;
+
+    if (M->isArrow()) {
+      BaseL = getLValue(St, Base);
+    } else {
+      Expr* TmpEx = Base->IgnoreParens();
+      // Check if the base expr is of (*p) pattern.
+      if (UnaryOperator* U = dyn_cast<UnaryOperator>(TmpEx)) {
+        assert(U->getOpcode() == UnaryOperator::Deref);
+        BaseL = getLValue(St, U->getSubExpr());
+      }

This peeking at the syntax of the subexpression doesn't look all that  
clean, but partially because I don't understand the role of BaseL and  
BaseV.  Couldn't the subexpression really be anything that returns an  
"lvalue"?  For example, what about:

   foo().f
   [myobject foo].f

I'm all about handling such things (e.g., the subexpression) with  
utility/recursive functions so we don't have to insert such grimy  
special cases into the main logic of the clients.

There also appears to be cases here where 'BaseL' is never assigned a  
value and left in the 'UnknownVal' state.  Is that intentional?  If  
so, please comment.

+    }
+
+    SVal BaseV = GetSVal(St, Base);

What is the difference between BaseL and BaseV?  This is a little  
confusing.

+
+    // Check if the base can be NonNull.
+    bool isFeasibleNotNull = false;
+    St = Assume(St, BaseV, true, isFeasibleNotNull);
+
+    St = StateMgr.CheckStructBase(St, BaseV, BaseL, Base,  
isFeasibleNotNull);
+

We should handle the two cases where 'BaseV' can be either null or not- 
null.  If 'BaseV' can be null, what do we want to return as the value  
of this expression?  For example, what does '&(x->f)' evaluate to when  
'x' is null?  Do we want to emit an error at this point (and stop) or  
do we want to compute the address of 0x0+offsetof(f).




More information about the cfe-dev mailing list