[cfe-dev] [PATCH] expand a symbolic pointer

Zhongxing Xu xuzhongxing at gmail.com
Wed Dec 10 20:06:32 PST 2008


Thanks for your comments. They were helpful. Now the interfaces get
simplified a lot. Variables and functions are commented well.

Here I explain the main idea.

Some struct base is not mutable, e.g., struct s a; a.f = 3; Here the struct
base 'a' is not mutable. But some can, e.g.,

void foo(struct s* p) {
  p->f = 3
}

When we getInitialStore(), p is initialized to have value loc::SymbolVal,
which I call a symbolic pointer. (Note: here no symbolic region is
involved).

In this patch we try to concretize the struct pointer p. We have several
preconditions for this concretization:
 - the struct base is a pointer
 - the pointer is symbolic
 - the pointer is an lvalue, so that it can be modified.

When concretizing the struct base pointer, we pass the pointer's location to
StoreManager. Therefore the pointer should be an lvalue and has a location.
We get this location by calling
GRExprEngine::getBasePtrLoc(). This method only handles a few cases. In
other cases, BasePtrLoc is left Undefined intentionally.

These are the main ideas. Other issues were commented in the patch

On Thu, Dec 11, 2008 at 9:43 AM, Ted Kremenek <kremenek at apple.com> wrote:

>
> On Dec 9, 2008, at 10:58 PM, Zhongxing Xu wrote:
>
>  New patch.
>>  - A new method CheckStructBase is taking care of the creation of
>> AnonPointeeRegion.
>>  - A lightweight lvalue evaluation method is used to get the location of
>> the base expr
>>   whose store may be changed.
>>  - AnonPointeeRegion is now identified the the memregion of the pointer
>> pointing to it.
>>
>
> Nice work!  This is definitely going in the right direction!  Comments
> inline:
>
> Index: include/clang/Analysis/PathSensitive/Store.h
> ===================================================================
> --- include/clang/Analysis/PathSensitive/Store.h        (revision 60812)
> +++ include/clang/Analysis/PathSensitive/Store.h        (working copy)
> @@ -86,6 +86,12 @@
>   ///  conversions between arrays and pointers.
>   virtual SVal ArrayToPointer(SVal Array) = 0;
>
> +  /// CheckStructBase - StoreManagers can conditionally expand a symbolic
> +  /// pointer into a pointer to a struct region.
> +  virtual Store CheckStructBase(Store store, SVal BaseV, SVal BaseL, bool
> NonNull){
> +    return store;
> +  }
>
> Please add comments about what this method does.  The comment above the
> method is a little content-free.
>
> Also please document what are the roles of 'BaseV' and 'BaseL'.  As you
> will see in my later comments, there roles aren't really all that clear to
> me, and I wonder if we can make this interface even simpler.
>
> Moreover, why do we need the parameter 'NonNull'?  Why not just have
> GRExprEngine check if that value is false and then not call this method in
> the first place.  In other words, in the case where the pointer is null,
> GRExprEngine knows that the StoreManager isn't going to do anything.  This
> also simplifies the implementation (and interface) for StoreManager.
>
> Finally, the name 'CheckStructBase' doesn't really mean anything to me.
>
>
> Index: include/clang/Analysis/PathSensitive/GRExprEngine.h
> ===================================================================
> --- include/clang/Analysis/PathSensitive/GRExprEngine.h (revision 60812)
> +++ include/clang/Analysis/PathSensitive/GRExprEngine.h (working copy)
> @@ -505,6 +505,9 @@
>   /// a DeclRefExpr, it evaluates to the MemRegionVal which represents its
>   /// storage location. Note that not all kinds of expressions has lvalue.
>   void VisitLValue(Expr* Ex, NodeTy* Pred, NodeSet& Dst);
> +
> +  /// getLValue - A lightweight lvalue evaluation method.
> +  SVal getLValue(const GRState* St, Expr* Ex);
>
> The purpose of this method is not really clear from the comment.  Please
> elaborate.  Moreover, it appears to only have one caller.
>
> Index: include/clang/Analysis/PathSensitive/MemRegion.h
> ===================================================================
> --- include/clang/Analysis/PathSensitive/MemRegion.h    (revision 60812)
> +++ include/clang/Analysis/PathSensitive/MemRegion.h    (working copy)
> @@ -237,20 +237,20 @@
>  ///  parameters or pointer globals. In RegionStoreManager, we assume
> pointer
>  ///  parameters or globals point at some anonymous region. Such regions
> are not
>  ///  the regions associated with the pointer variables themselves.  They
> are
> -///  identified with the VarDecl of the pointer variable. We create them
> lazily.
> +///  identified with the MemRegion of the pointer. We create them lazily.
>
>  class AnonPointeeRegion : public TypedRegion {
>   friend class MemRegionManager;
>   // VD - the pointer variable that points at this region.
> -  const VarDecl* Pointer;
> +  const TypedRegion* Pointer;
>
>
> All the AnonPointeeRegion stuff looks great!  You can probably just commit
> that stuff as a separate patch since it naturally subsumes the current
> logic.
>
>
> Index: lib/Analysis/RegionStore.cpp
> ===================================================================
> --- lib/Analysis/RegionStore.cpp        (revision 60812)
> +++ lib/Analysis/RegionStore.cpp        (working copy)
> @@ -101,6 +101,8 @@
>
>   SVal ArrayToPointer(SVal Array);
>
> +  Store CheckStructBase(Store store, SVal BaseV, SVal BaseL, bool
> NonNull);
> +
>
> Please put a comment here as well.  Again, what is the purpose of
> 'NonNull'.  I don't think it's needed.
>
>
>
>
> +Store RegionStoreManager::CheckStructBase(Store store, SVal BaseV, SVal
> BaseL,
> +                                          bool NonNull) {
> +  // Expand the symbolic pointer only when it can be NonNull.
> +  if (isa<loc::SymbolVal>(BaseV) && NonNull) {
> +    // Get the location that points to the struct.
> +    loc::MemRegionVal& RV = cast<loc::MemRegionVal>(BaseL);
> +
> +    // Create an AnonPointeeRegion for it.
> +    AnonPointeeRegion* APR =
> +      MRMgr.getAnonPointeeRegion(cast<TypedRegion>(RV.getRegion()));
> +
> +    // Symbolize the struct region.
> +    store = BindStructToSymVal(store, APR);
> +
> +    // Update the base pointer.
> +    store = Bind(store, RV, Loc::MakeVal(APR));
> +  }
>
> Shouldn't BaseV always be symbolic?  (i.e., an assertion, not an 'if').
>
> NonNull isn't needed.  We can simplify the interface and make it assumed.
>
> Does this handle the case where a symbolic pointer to a struct has already
> had a region lazily created for it?
>
> I'm also confused about the difference between BaseL and BaseV.  Don't we
> need just one?  It seems like the basic interface is that we pass in a
> SymbolicRegion.  No SVals at all.  The interface then gets simpler and more
> obvious.
>
> Please document what CheckStructBase is, say its pre- and post- conditions
> (i.e., its interface), etc.
>
>
>  SVal RegionStoreManager::Retrieve(const GRState* state, Loc L, QualType T)
> {
>   assert(!isa<UnknownVal>(L) && "location unknown");
>   assert(!isa<UndefinedVal>(L) && "location undefined");
> Index: lib/Analysis/GRExprEngine.cpp
> ===================================================================
> --- lib/Analysis/GRExprEngine.cpp       (revision 60812)
> +++ lib/Analysis/GRExprEngine.cpp       (working copy)
> @@ -425,6 +425,28 @@
>   }
>  }
>
> +SVal GRExprEngine::getLValue(const GRState* St, Expr* Ex) {
> +  switch (Ex->getStmtClass()) {
> +  case Stmt::DeclRefExprClass: {
> +    const VarDecl* VD = cast<VarDecl>(cast<DeclRefExpr>(Ex)->getDecl());
> +    return StateMgr.GetLValue(St, VD);
> +  }
> +
> +  case Stmt::UnaryOperatorClass: {
> +    UnaryOperator* U = cast<UnaryOperator>(Ex);
> +    if (U->getOpcode() == UnaryOperator::Deref) {
> +      // The lvalue of *x is the rvalue of x, and it should be already
> evaluated.
> +      return GetSVal(St, U->getSubExpr());
> +    } else
> +      return UndefinedVal();
> +  }
> +
> +  default:
> +    assert(0 && "Unhandled Stmt class.");
> +    return UndefinedVal();
> +  }
> +}
>
> Please add some comments to 'getLValue'.  Please clarifiy what purpose it
> serves (and for whom?).  Since that name is so clase to
> GRStateManager::getLValue(), its worth having more comments (since it does
> something a little different).  I'm not certain what it's role is compared
> to VisitLValue.
>
>  void GRExprEngine::VisitLValue(Expr* Ex, NodeTy* Pred, NodeSet& Dst) {
>
>   Ex = Ex->IgnoreParens();
> @@ -935,6 +957,29 @@
>     // FIXME: Should we insert some assumption logic in here to determine
>     // if "Base" is a valid piece of memory?  Before we put this assumption
>     // later when using FieldOffset lvals (which we no longer have).
>
> Please add comments saying what this code is doing.
>
> +
> +    // Some kinds of struct base are mutable. Get base expr's location.
> +    SVal BaseL;
> +
> +    if (M->isArrow()) {
> +      BaseL = getLValue(St, Base);
> +    } else {
> +      Expr* TmpEx = Base->IgnoreParens();
> +      // Check if the base expr is of (*p) pattern.
> +      if (UnaryOperator* U = dyn_cast<UnaryOperator>(TmpEx)) {
> +        assert(U->getOpcode() == UnaryOperator::Deref);
> +        BaseL = getLValue(St, U->getSubExpr());
> +      }
>
> This peeking at the syntax of the subexpression doesn't look all that
> clean, but partially because I don't understand the role of BaseL and BaseV.
>  Couldn't the subexpression really be anything that returns an "lvalue"?
>  For example, what about:
>
>  foo().f
>  [myobject foo].f
>
> I'm all about handling such things (e.g., the subexpression) with
> utility/recursive functions so we don't have to insert such grimy special
> cases into the main logic of the clients.
>
> There also appears to be cases here where 'BaseL' is never assigned a value
> and left in the 'UnknownVal' state.  Is that intentional?  If so, please
> comment.
>
> +    }
> +
> +    SVal BaseV = GetSVal(St, Base);
>
> What is the difference between BaseL and BaseV?  This is a little
> confusing.
>
> +
> +    // Check if the base can be NonNull.
> +    bool isFeasibleNotNull = false;
> +    St = Assume(St, BaseV, true, isFeasibleNotNull);
> +
> +    St = StateMgr.CheckStructBase(St, BaseV, BaseL, Base,
> isFeasibleNotNull);
> +
>
> We should handle the two cases where 'BaseV' can be either null or
> not-null.  If 'BaseV' can be null, what do we want to return as the value of
> this expression?  For example, what does '&(x->f)' evaluate to when 'x' is
> null?  Do we want to emit an error at this point (and stop) or do we want to
> compute the address of 0x0+offsetof(f).
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20081211/6aea8bec/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: concretize.patch
Type: application/octet-stream
Size: 6837 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20081211/6aea8bec/attachment.obj>


More information about the cfe-dev mailing list