[cfe-dev] fix for Clang PR 8419

Ted Kremenek kremenek at apple.com
Fri Nov 19 13:54:46 PST 2010


On Nov 19, 2010, at 10:38 AM, Zhanyong Wan (λx.x x) wrote:

> Hi, Ted,
> 
> I made a new patch based on our discussion last night.  I've uploaded
> it to http://codereview.appspot.com/2920041/ (the raw patch can be
> found at http://codereview.appspot.com/download/issue2920041_2001.diff).
> 
> Things to note:
> 
> 1. In addition to trying to fix PR8419, I also added comments that
> helped me understand how the code I'm touching works.
> 
> 2. The meat of the fix is in CFG.cpp.  I'm still not sure if this is a
> principled fix, but all tests pass, so I hope it's at least an
> improvement.
> 
> Would you please take a look?  Thanks!

Hi Zhanyong,

I think there is goodness here.  I'd prefer this be separated up into separate commits.  For example, purely documentation comments that are unrelated to the functionality changes should go into a separate commit.

Comments inline.

> Index: test/Analysis/misc-ps-region-store.cpp
> ===================================================================
> --- test/Analysis/misc-ps-region-store.cpp	(revision 119815)
> +++ test/Analysis/misc-ps-region-store.cpp	(working copy)
> @@ -159,6 +159,25 @@
>    for (; ; x++) { }
>  }
>  
> +// PR8419 -- this used to crash.
> +
> +class String8419 {
> + public:
> +  char& get(int n);
> +  char& operator[](int n);
> +};
> +
> +char& get8419();
> +
> +void Test8419() {
> +  String8419 s;
> +  ++(s.get(0));
> +  get8419()++;  // used to crash
> +  ++s[0];       // used to crash
> +  s[0] &= 1;    // used to crash
> +  s[0]++;       // used to crash
> +}
> +
>  // PR8426 -- this used to crash.


Test case is reasonable.  This this also pass with the basic store?  If so, this should just go in misc-ps.m.

>  
>  void Use(void* to);
> Index: include/clang/Checker/PathSensitive/Store.h
> ===================================================================
> --- include/clang/Checker/PathSensitive/Store.h	(revision 119815)
> +++ include/clang/Checker/PathSensitive/Store.h	(working copy)
> @@ -22,6 +22,9 @@
>  
>  namespace clang {
>  
> +/// Store - This opaque type encapsulates an immutable map.  Different
> +///  subclasses of StoreManager may choose to put different types of
> +///  keys and values in a Store.
>  typedef const void* Store;

Instead of "immutable map", I would say "an immutable mapping from locations to values."  The second sentence is incorrect.  The entire purpose of the store is to map from locations to values; not arbitrary keys and values.  Different StoreManagers may represent this mapping as they choose.  At a high-level, the store represents the symbolic memory model.


>  
>  class GRState;
> Index: include/clang/Checker/PathSensitive/GRExprEngine.h
> ===================================================================
> --- include/clang/Checker/PathSensitive/GRExprEngine.h	(revision 119815)
> +++ include/clang/Checker/PathSensitive/GRExprEngine.h	(working copy)
> @@ -513,6 +513,10 @@
>  public:
>    // FIXME: 'tag' should be removed, and a LocationContext should be used
>    // instead.
> +  // FIXME: Comment on the meaning of the arguments, when 'St' may not
> +  // be the same as Pred->state, and when 'location' may not be the
> +  // same as state->getLValue(Ex).
> +  /// Simulate a read of the result of Ex.
>    void EvalLoad(ExplodedNodeSet& Dst, const Expr* Ex, ExplodedNode* Pred,
>                  const GRState* St, SVal location, const void *tag = 0,
>                  QualType LoadTy = QualType());
> @@ -522,7 +526,7 @@
>    void EvalStore(ExplodedNodeSet& Dst, const Expr* AssignE, const Expr* StoreE,
>                   ExplodedNode* Pred, const GRState* St, SVal TargetLV, SVal Val,
>                   const void *tag = 0);
> -private:  
> +private:
>    void EvalLoadCommon(ExplodedNodeSet& Dst, const Expr* Ex, ExplodedNode* Pred,
>                        const GRState* St, SVal location, const void *tag,
>                        QualType LoadTy);

Comment seems reasonable.


> Index: include/clang/Checker/PathSensitive/Environment.h
> ===================================================================
> --- include/clang/Checker/PathSensitive/Environment.h	(revision 119815)
> +++ include/clang/Checker/PathSensitive/Environment.h	(working copy)
> @@ -27,7 +27,12 @@
>  class ValueManager;
>  class LiveVariables;
>  
> +// FIXME: The name clang::Environment is too generic.  Find a more
> +// descriptive name (e.g. GREnvironment?).

Instead of a FIXME, why not just do this change?

>  
> +/// Environment - An immutable map from Stmts to their current
> +///  symbolic values (SVals).
> +///

Comment seems reasonable.  This is basically the dual of 'store.'  Instead of mapping from locations to values, it maps from expressions to values.  This essentially captures the current semantics of an expression as it was just evaluated by the analyzer engine.

>  class Environment {
>  private:
>    friend class EnvironmentManager;
> @@ -51,6 +56,9 @@
>      return X ? *X : UnknownVal();
>    }
>  
> +  /// Like LookupExpr, but doesn't require Ex to be in the map (in
> +  ///  which case the return value will either be UnknownVal() or
> +  ///  created by the ValueManager).
>    SVal GetSVal(const Stmt* Ex, ValueManager& ValMgr) const;

This is a high-level, public API; this level of detail in the comment seems a little weird.  The fact that values are lazily generated is an implementation detail.  I'd rather the comment just say that this fetches the current binding of the expression in the Environment.  The fact that the binding is implicitly or explicitly represented in the Environment is an implementation detail.

Also, LookupExpr is a private detail of the class.  Comparing GetSVal to LookupExpr is a little weird when it comes to documenting a method used by clients.

>  
>    /// Profile - Profile the contents of an Environment object for use
> Index: include/clang/Checker/PathSensitive/GRState.h
> ===================================================================
> --- include/clang/Checker/PathSensitive/GRState.h	(revision 119815)
> +++ include/clang/Checker/PathSensitive/GRState.h	(working copy)
> @@ -1,4 +1,4 @@
> -//== GRState*h - Path-Sens. "State" for tracking valuues -----*- C++ -*--==//
> +//== GRState.h - Path-sensitive "State" for tracking values -----*- C++ -*--==//

Good cleanup.

>  //
>  //                     The LLVM Compiler Infrastructure
>  //
> @@ -7,7 +7,7 @@
>  //
>  //===----------------------------------------------------------------------===//
>  //
> -//  This file defines SymbolRef, ExprBindKey, and GRState*
> +//  This file defines SymbolRef, ExprBindKey, and GRState*.
>  //
>  //===----------------------------------------------------------------------===//
>  
> @@ -53,15 +53,17 @@
>  };
>  
>  //===----------------------------------------------------------------------===//
> -// GRState- An ImmutableMap type Stmt*/Decl*/Symbols to SVals.
> +// GRState- An ImmutableMap from Stmt*/Decl*/Symbols to SVals.

This comment is actually out-of-date, since it isn't an ImmutableMap object anymore.

>  //===----------------------------------------------------------------------===//
>  
>  class GRStateManager;
>  
> -/// GRState - This class encapsulates the actual data values for
> -///  for a "state" in our symbolic value tracking.  It is intended to be
> -///  used as a functional object; that is once it is created and made
> -///  "persistent" in a FoldingSet its values will never change.
> +/// GRState - This class encapsulates the symbolic values of all Stmts
> +///  in the program and the constraints on the values at a particular
> +///  moment.

I would probably bullet it out.  The GRState represents:

(1) Mapping from locations to values (Store)
(2) Mapping from expressions to values (Environment)
(3) Constraints on symbolic values

Together these represent the "abstract state" of a program.

>  A client of this class may also embed arbitrary data in
> +///  it via a GenericDataMap.  GRState is intended to be used as a
> +///  functional object; that is, once it is created and made
> +///  "persistent" in a FoldingSet, its values will never change.
>  class GRState : public llvm::FoldingSetNode {
>  public:
>    typedef llvm::ImmutableSet<llvm::APSInt*>                IntSetTy;
> @@ -73,9 +75,9 @@
>    friend class GRStateManager;
>  
>    GRStateManager *StateMgr;
> -  Environment Env;
> +  Environment Env;           // Maps a Stmt to its current SVal.
>    Store St;
> -  GenericDataMap   GDM;
> +  GenericDataMap   GDM;      // custom data stored by a client of this class

Please keep comments consistent.  Start with capital letters; end with with periods.

>  
>    /// makeWithStore - Return a GRState with the same values as the current
>    ///  state with the exception of using the specified Store.
> @@ -120,8 +122,9 @@
>  
>    void setGDM(GenericDataMap gdm) { GDM = gdm; }
>  
> -  /// Profile - Profile the contents of a GRState object for use
> -  ///  in a FoldingSet.
> +  /// Profile - Profile the contents of a GRState object for use in a
> +  ///  FoldingSet.  Two GRState objects are considered equal if they
> +  ///  have the same Environment, Store, and GenericDataMap.

Looks great.

>    static void Profile(llvm::FoldingSetNodeID& ID, const GRState* V) {
>      V->Env.Profile(ID);
>      ID.AddPointer(V->St);
> @@ -163,19 +166,14 @@
>    //  (3) A binary value "Assumption" that indicates whether the constraint is
>    //      assumed to be true or false.
>    //
> -  // The output of "Assume" are two values:
> +  // The output of "Assume*" is a new GRState object with the added constraints.
> +  // If no new state is feasible, NULL is returned.

Looks great.

>    //
> -  //  (a) "isFeasible" is set to true or false to indicate whether or not
> -  //      the assumption is feasible.
> -  //
> -  //  (b) A new GRState object with the added constraints.
> -  //
> -  // FIXME: (a) should probably disappear since it is redundant with (b).
> -  //  (i.e., (b) could just be set to NULL).
> -  //
>  
>    const GRState *Assume(DefinedOrUnknownSVal cond, bool assumption) const;
> -  
> +
> +  // FIXME: The type of this method doesn't match the comment above.
> +  // Comment on what this really does.

Here's what this method does.  The above version of Assume takes an 'assumption' parameter.  This one doesn't.  This method assumes both "true" and "false" for 'cond', and returns both corresponding states.  It's shorthand for doing 'Assume' twice.

>    std::pair<const GRState*, const GRState*>
>    Assume(DefinedOrUnknownSVal cond) const;
>  
> @@ -194,9 +192,7 @@
>    //==---------------------------------------------------------------------==//
>  
>    /// BindCompoundLiteral - Return the state that has the bindings currently
> -  ///  in 'state' plus the bindings for the CompoundLiteral.  'R' is the region
> -  ///  for the compound literal and 'BegInit' and 'EndInit' represent an
> -  ///  array of initializer values.
> +  ///  in this state plus the bindings for the CompoundLiteral.

Looks good.

>    const GRState *bindCompoundLiteral(const CompoundLiteralExpr* CL,
>                                       const LocationContext *LC,
>                                       SVal V) const;
> Index: include/clang/Checker/PathSensitive/SVals.h
> ===================================================================
> --- include/clang/Checker/PathSensitive/SVals.h	(revision 119815)
> +++ include/clang/Checker/PathSensitive/SVals.h	(working copy)
> @@ -39,13 +39,26 @@
>  class GRStateManager;
>  class ValueManager;
>  
> +/// SVal - This represents a symbolic expression, which can be either
> +///  an L-value or an R-value.
> +///

Looks good.

>  class SVal {
>  public:
> -  enum BaseKind { UndefinedKind, UnknownKind, LocKind, NonLocKind };
> +  enum BaseKind {
> +    // The enumerators must be representable using 2 bits.
> +    UndefinedKind = 0,  // for subclass UndefinedVal (an uninitialized value)
> +    UnknownKind = 1,    // for subclass UnknownVal (a void value)
> +    LocKind = 2,        // for subclass Loc (an L-value)
> +    NonLocKind = 3      // for subclass NonLoc (an R-value that's not
> +                        //   an L-value)
> +  };
>    enum { BaseBits = 2, BaseMask = 0x3 };
>  
>  protected:
>    const void* Data;
> +
> +  /// The lowest 2 bits are a BaseKind (0 -- 3).
> +  ///  The higher bits are an unsigned "kind" value.
>    unsigned Kind;
>  
>  protected:
> @@ -200,7 +213,7 @@
>      return V->getBaseKind() == UnknownKind;
>    }
>  };
> -  
> +
>  class DefinedSVal : public DefinedOrUnknownSVal {
>  private:
>    // Do not implement.  We want calling these methods to be a compiler
> Index: lib/Analysis/CFG.cpp
> ===================================================================
> --- lib/Analysis/CFG.cpp	(revision 119815)
> +++ lib/Analysis/CFG.cpp	(working copy)
> @@ -900,12 +900,29 @@
>    return VisitChildren(S);
>  }
>  
> +// Returns true if a CFGBuilder should treat 'S' as an L-value when
> +// visiting it.
> +static bool ShouldTreatAsLValue(Stmt* S) {
> +  // A call to a reference-returning function is an L-value.
> +  if (CallExpr* CallEx = dyn_cast<CallExpr>(S))
> +    return CallEx->getCallReturnType()->isReferenceType();
> +
> +  // Otherwise, we assume it's not an L-value.
> +  // FIXME: document why this assumption is OK, or fix this in a
> +  // principled way if it's not.
> +  return false;
> +}
> +
>  /// VisitChildren - Visit the children of a Stmt.
>  CFGBlock *CFGBuilder::VisitChildren(Stmt* Terminator) {
>    CFGBlock *B = Block;
>    for (Stmt::child_iterator I = Terminator->child_begin(),
>           E = Terminator->child_end(); I != E; ++I) {
> -    if (*I) B = Visit(*I);
> +    if (*I) {
> +      B = Visit(*I, ShouldTreatAsLValue(*I) ?
> +                AddStmtChoice::AsLValueNotAlwaysAdd :
> +                AddStmtChoice::NotAlwaysAdd);
> +    }
>    }
>    return B;
>  }

I don't this is right.  I'll comment in a separate email.


> Index: lib/Checker/Environment.cpp
> ===================================================================
> --- lib/Checker/Environment.cpp	(revision 119815)
> +++ lib/Checker/Environment.cpp	(working copy)
> @@ -101,7 +101,8 @@
>  Environment EnvironmentManager::bindExprAndLocation(Environment Env,
>                                                      const Stmt *S,
>                                                      SVal location, SVal V) {
> -  return Environment(F.Add(F.Add(Env.ExprBindings, MakeLocation(S), V), S, V));
> +  return Environment(F.Add(F.Add(Env.ExprBindings, MakeLocation(S), location),
> +                           S, V));
>  }

This is a good bug fix, and should go into a patch of its own.

>  
>  namespace {
> Index: lib/Checker/GRExprEngine.cpp
> ===================================================================
> --- lib/Checker/GRExprEngine.cpp	(revision 119815)
> +++ lib/Checker/GRExprEngine.cpp	(working copy)
> @@ -909,7 +909,7 @@
>        break;
>      }
>  
> -    case Stmt::CXXMemberCallExprClass: {
> +    case Stmt::CXXMemberCallExprClass:  {
>        const CXXMemberCallExpr *MCE = cast<CXXMemberCallExpr>(S);
>        VisitCXXMemberCallExpr(MCE, Pred, Dst);
>        break;
> @@ -1944,10 +1944,11 @@
>      EvalBind(Dst, StoreE, *NI, GetState(*NI), location, Val);
>  }

What changed here?

>  
> -void GRExprEngine::EvalLoad(ExplodedNodeSet& Dst, const Expr *Ex, 
> +void GRExprEngine::EvalLoad(ExplodedNodeSet& Dst, const Expr *Ex,
>                              ExplodedNode* Pred,
>                              const GRState* state, SVal location,
>                              const void *tag, QualType LoadTy) {
> +  assert(!isa<NonLoc>(location) && "location cannot be a NonLoc.");

Is it possible to change 'location' to a Loc, instead of having this assertion?


>  
>    // Are we loading from a region?  This actually results in two loads; one
>    // to fetch the address of the referenced value and one to fetch the
> @@ -3154,7 +3155,7 @@
>    assert (U->isIncrementDecrementOp());
>    ExplodedNodeSet Tmp;
>    const Expr* Ex = U->getSubExpr()->IgnoreParens();
> -  VisitLValue(Ex, Pred, Tmp);
> +  VisitLValue(Ex, Pred, Tmp);  // This modifies Tmp.

Do you think this comment is necessary?  This happens all over the place within the engine.

>  
>    for (ExplodedNodeSet::iterator I = Tmp.begin(), E = Tmp.end(); I!=E; ++I) {
>  
> @@ -3163,7 +3164,7 @@
>  
>      // Perform a load.
>      ExplodedNodeSet Tmp2;
> -    EvalLoad(Tmp2, Ex, *I, state, V1);
> +    EvalLoad(Tmp2, Ex, *I, state, V1);  // This modifies Tmp2.

Do you think this comment is necessary?  This happens all over the place within the engine.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20101119/0656e281/attachment.html>


More information about the cfe-dev mailing list