[cfe-dev] [PATCH] expand a symbolic pointer
Ted Kremenek
kremenek at apple.com
Wed Dec 10 17:43:06 PST 2008
On Dec 9, 2008, at 10:58 PM, Zhongxing Xu wrote:
> New patch.
> - A new method CheckStructBase is taking care of the creation of
> AnonPointeeRegion.
> - A lightweight lvalue evaluation method is used to get the
> location of the base expr
> whose store may be changed.
> - AnonPointeeRegion is now identified the the memregion of the
> pointer pointing to it.
Nice work! This is definitely going in the right direction! Comments
inline:
Index: include/clang/Analysis/PathSensitive/Store.h
===================================================================
--- include/clang/Analysis/PathSensitive/Store.h (revision 60812)
+++ include/clang/Analysis/PathSensitive/Store.h (working copy)
@@ -86,6 +86,12 @@
/// conversions between arrays and pointers.
virtual SVal ArrayToPointer(SVal Array) = 0;
+ /// CheckStructBase - StoreManagers can conditionally expand a
symbolic
+ /// pointer into a pointer to a struct region.
+ virtual Store CheckStructBase(Store store, SVal BaseV, SVal BaseL,
bool NonNull){
+ return store;
+ }
Please add comments about what this method does. The comment above
the method is a little content-free.
Also please document what are the roles of 'BaseV' and 'BaseL'. As
you will see in my later comments, there roles aren't really all that
clear to me, and I wonder if we can make this interface even simpler.
Moreover, why do we need the parameter 'NonNull'? Why not just have
GRExprEngine check if that value is false and then not call this
method in the first place. In other words, in the case where the
pointer is null, GRExprEngine knows that the StoreManager isn't going
to do anything. This also simplifies the implementation (and
interface) for StoreManager.
Finally, the name 'CheckStructBase' doesn't really mean anything to me.
Index: include/clang/Analysis/PathSensitive/GRExprEngine.h
===================================================================
--- include/clang/Analysis/PathSensitive/GRExprEngine.h (revision 60812)
+++ include/clang/Analysis/PathSensitive/GRExprEngine.h (working copy)
@@ -505,6 +505,9 @@
/// a DeclRefExpr, it evaluates to the MemRegionVal which
represents its
/// storage location. Note that not all kinds of expressions has
lvalue.
void VisitLValue(Expr* Ex, NodeTy* Pred, NodeSet& Dst);
+
+ /// getLValue - A lightweight lvalue evaluation method.
+ SVal getLValue(const GRState* St, Expr* Ex);
The purpose of this method is not really clear from the comment.
Please elaborate. Moreover, it appears to only have one caller.
Index: include/clang/Analysis/PathSensitive/MemRegion.h
===================================================================
--- include/clang/Analysis/PathSensitive/MemRegion.h (revision 60812)
+++ include/clang/Analysis/PathSensitive/MemRegion.h (working copy)
@@ -237,20 +237,20 @@
/// parameters or pointer globals. In RegionStoreManager, we assume
pointer
/// parameters or globals point at some anonymous region. Such
regions are not
/// the regions associated with the pointer variables themselves.
They are
-/// identified with the VarDecl of the pointer variable. We create
them lazily.
+/// identified with the MemRegion of the pointer. We create them
lazily.
class AnonPointeeRegion : public TypedRegion {
friend class MemRegionManager;
// VD - the pointer variable that points at this region.
- const VarDecl* Pointer;
+ const TypedRegion* Pointer;
All the AnonPointeeRegion stuff looks great! You can probably just
commit that stuff as a separate patch since it naturally subsumes the
current logic.
Index: lib/Analysis/RegionStore.cpp
===================================================================
--- lib/Analysis/RegionStore.cpp (revision 60812)
+++ lib/Analysis/RegionStore.cpp (working copy)
@@ -101,6 +101,8 @@
SVal ArrayToPointer(SVal Array);
+ Store CheckStructBase(Store store, SVal BaseV, SVal BaseL, bool
NonNull);
+
Please put a comment here as well. Again, what is the purpose of
'NonNull'. I don't think it's needed.
+Store RegionStoreManager::CheckStructBase(Store store, SVal BaseV,
SVal BaseL,
+ bool NonNull) {
+ // Expand the symbolic pointer only when it can be NonNull.
+ if (isa<loc::SymbolVal>(BaseV) && NonNull) {
+ // Get the location that points to the struct.
+ loc::MemRegionVal& RV = cast<loc::MemRegionVal>(BaseL);
+
+ // Create an AnonPointeeRegion for it.
+ AnonPointeeRegion* APR =
+ MRMgr.getAnonPointeeRegion(cast<TypedRegion>(RV.getRegion()));
+
+ // Symbolize the struct region.
+ store = BindStructToSymVal(store, APR);
+
+ // Update the base pointer.
+ store = Bind(store, RV, Loc::MakeVal(APR));
+ }
Shouldn't BaseV always be symbolic? (i.e., an assertion, not an 'if').
NonNull isn't needed. We can simplify the interface and make it
assumed.
Does this handle the case where a symbolic pointer to a struct has
already had a region lazily created for it?
I'm also confused about the difference between BaseL and BaseV. Don't
we need just one? It seems like the basic interface is that we pass
in a SymbolicRegion. No SVals at all. The interface then gets
simpler and more obvious.
Please document what CheckStructBase is, say its pre- and post-
conditions (i.e., its interface), etc.
SVal RegionStoreManager::Retrieve(const GRState* state, Loc L,
QualType T) {
assert(!isa<UnknownVal>(L) && "location unknown");
assert(!isa<UndefinedVal>(L) && "location undefined");
Index: lib/Analysis/GRExprEngine.cpp
===================================================================
--- lib/Analysis/GRExprEngine.cpp (revision 60812)
+++ lib/Analysis/GRExprEngine.cpp (working copy)
@@ -425,6 +425,28 @@
}
}
+SVal GRExprEngine::getLValue(const GRState* St, Expr* Ex) {
+ switch (Ex->getStmtClass()) {
+ case Stmt::DeclRefExprClass: {
+ const VarDecl* VD = cast<VarDecl>(cast<DeclRefExpr>(Ex)-
>getDecl());
+ return StateMgr.GetLValue(St, VD);
+ }
+
+ case Stmt::UnaryOperatorClass: {
+ UnaryOperator* U = cast<UnaryOperator>(Ex);
+ if (U->getOpcode() == UnaryOperator::Deref) {
+ // The lvalue of *x is the rvalue of x, and it should be
already evaluated.
+ return GetSVal(St, U->getSubExpr());
+ } else
+ return UndefinedVal();
+ }
+
+ default:
+ assert(0 && "Unhandled Stmt class.");
+ return UndefinedVal();
+ }
+}
Please add some comments to 'getLValue'. Please clarifiy what purpose
it serves (and for whom?). Since that name is so clase to
GRStateManager::getLValue(), its worth having more comments (since it
does something a little different). I'm not certain what it's role is
compared to VisitLValue.
void GRExprEngine::VisitLValue(Expr* Ex, NodeTy* Pred, NodeSet& Dst) {
Ex = Ex->IgnoreParens();
@@ -935,6 +957,29 @@
// FIXME: Should we insert some assumption logic in here to
determine
// if "Base" is a valid piece of memory? Before we put this
assumption
// later when using FieldOffset lvals (which we no longer have).
Please add comments saying what this code is doing.
+
+ // Some kinds of struct base are mutable. Get base expr's location.
+ SVal BaseL;
+
+ if (M->isArrow()) {
+ BaseL = getLValue(St, Base);
+ } else {
+ Expr* TmpEx = Base->IgnoreParens();
+ // Check if the base expr is of (*p) pattern.
+ if (UnaryOperator* U = dyn_cast<UnaryOperator>(TmpEx)) {
+ assert(U->getOpcode() == UnaryOperator::Deref);
+ BaseL = getLValue(St, U->getSubExpr());
+ }
This peeking at the syntax of the subexpression doesn't look all that
clean, but partially because I don't understand the role of BaseL and
BaseV. Couldn't the subexpression really be anything that returns an
"lvalue"? For example, what about:
foo().f
[myobject foo].f
I'm all about handling such things (e.g., the subexpression) with
utility/recursive functions so we don't have to insert such grimy
special cases into the main logic of the clients.
There also appears to be cases here where 'BaseL' is never assigned a
value and left in the 'UnknownVal' state. Is that intentional? If
so, please comment.
+ }
+
+ SVal BaseV = GetSVal(St, Base);
What is the difference between BaseL and BaseV? This is a little
confusing.
+
+ // Check if the base can be NonNull.
+ bool isFeasibleNotNull = false;
+ St = Assume(St, BaseV, true, isFeasibleNotNull);
+
+ St = StateMgr.CheckStructBase(St, BaseV, BaseL, Base,
isFeasibleNotNull);
+
We should handle the two cases where 'BaseV' can be either null or not-
null. If 'BaseV' can be null, what do we want to return as the value
of this expression? For example, what does '&(x->f)' evaluate to when
'x' is null? Do we want to emit an error at this point (and stop) or
do we want to compute the address of 0x0+offsetof(f).
More information about the cfe-dev
mailing list