[cfe-commits] r75356 - in /cfe/trunk: lib/Analysis/GRExprEngine.cpp test/Analysis/misc-ps.m

Ted Kremenek kremenek at apple.com
Mon Jul 13 16:37:55 PDT 2009


On Jul 11, 2009, at 3:42 AM, Zhongxing Xu wrote:

> Hi Ted,
>
> Here is another fix for this bug. Instead of recovering from a wrong
> invalidation, this patch aims to invalidate the region correctly. It
> uses the cast-to type to invalidate the region when available. To
> avoid invalid cast-to type like 'void*' or 'id', region store now only
> records non-generic casts of regions.
>
> On Sat, Jul 11, 2009 at 12:38 PM, Ted Kremenek<kremenek at apple.com>  
> wrote:
>> Author: kremenek
>> Date: Fri Jul 10 23:38:49 2009
>> New Revision: 75356
>>
>> URL: http://llvm.org/viewvc/llvm-project?rev=75356&view=rev
>> Log:
>> Handle insidious corner case exposed by RegionStoreManager when  
>> handling void* values that are bound
>> to symbolic regions and then treated like integers.
>>
>> Modified:
>>    cfe/trunk/lib/Analysis/GRExprEngine.cpp
>>    cfe/trunk/test/Analysis/misc-ps.m
>


Hi Zhongxing,

I spent some time looking over this patch and testing it's behavior.   
I think we're on the right track, but the patch itself isn't where we  
need it yet.  First I'll make some comments inline to mention some  
specific points, and then mention some other issues I observed.

Index: include/clang/Analysis/PathSensitive/Store.h
===================================================================
--- include/clang/Analysis/PathSensitive/Store.h	(revision 75492)
+++ include/clang/Analysis/PathSensitive/Store.h	(working copy)
@@ -145,6 +145,10 @@
      return state;
    }

+  virtual const QualType *getCastType(const GRState *state, const  
MemRegion *R){
+    return 0;
+  }
+

Seems fine.  This is our notion of type information on the side.  It  
isn't as rich as it needs to be in the long term, but it's a good start.


    /// EvalBinOp - Perform pointer arithmetic.
    virtual SVal EvalBinOp(const GRState *state,  
BinaryOperator::Opcode Op,
                           Loc lhs, NonLoc rhs, QualType resultTy) {
Index: lib/Analysis/RegionStore.cpp
===================================================================
--- lib/Analysis/RegionStore.cpp	(revision 75492)
+++ lib/Analysis/RegionStore.cpp	(working copy)
@@ -327,6 +327,10 @@
    const GRState *setCastType(const GRState *state, const MemRegion* R,
                               QualType T);

+  const QualType *getCastType(const GRState *state, const MemRegion  
*R) {
+    return state->get<RegionCasts>(R);
+  }
+

Makes sense.


    static inline RegionBindingsTy GetRegionBindings(Store store) {
     return RegionBindingsTy(static_cast<const  
RegionBindingsTy::TreeTy*>(store));
    }
@@ -348,7 +352,26 @@
  };

  } // end anonymous namespace
+static bool isGenericPtr(ASTContext &Ctx, QualType Ty) {
+  if (Ty->isObjCIdType() || Ty->isObjCQualifiedIdType())
+    return true;
+  while (true) {
+    Ty = Ctx.getCanonicalType(Ty);

+    if (Ty->isVoidType())
+      return true;
+
+    if (const PointerType *PT = Ty->getAsPointerType()) {
+      Ty = PT->getPointeeType();
+      continue;
+    }
+
+    break;
+  }
+
+  return false;
+}
+

I have mixed feelings about this.  The problem is that there are wide  
range of generic pointers, e.g. "char *" is also a generic pointer.  I  
guess as long as we are using this just to handle transient type  
erasure then this seems okay.

  // 
= 
= 
=---------------------------------------------------------------------- 
===//
  // RegionStore creation.
  // 
= 
= 
=---------------------------------------------------------------------- 
===//
@@ -1251,6 +1274,8 @@

  const GRState *RegionStoreManager::setCastType(const GRState *state,
  					       const MemRegion* R, QualType T) {
+  if (isGenericPtr(getContext(), T))
+    return state;
    return state->set<RegionCasts>(R, T);
  }

Makes sense.  I think we should add a comment about why the check for  
'isGenericPtr' is there.  Also, we should document that if the region  
already has a cast type, casting it to void* doesn't remove that cast  
type.

What should the semantics be if there already is a cast type  
associated with a region?


Index: lib/Analysis/Store.cpp
===================================================================
--- lib/Analysis/Store.cpp	(revision 75492)
+++ lib/Analysis/Store.cpp	(working copy)
@@ -235,8 +235,16 @@

    const TypedRegion *TR = cast<TypedRegion>(R);

-  QualType T = TR->getValueType(Ctx);
+  QualType T;

+  // If the region is cast to another type, use that type.
+  const QualType *CastTy = getCastType(state, R);
+  if (CastTy) {
+    T = (*CastTy)->getAsPointerType()->getPointeeType();
+  }

As a nitpick, I strongly prefer the style:

   if (const QualType *CastTy = ...)

   to

   const QualType *CastTy;
   if (CastTy = ...)

The first case takes less code and restricts the scope of CastTy to  
the 'if' statement.  I like restricting scope of local variables to  
just the place where they are logically used, and no bigger.  It  
allows people reading the code to understand immediately that the  
value isn't used later.  It also reduces some subtle programming  
errors from a variable (incorrectly) being re-used.

Second, this doesn't handle the case anymore where the value is an  
Objective-C pointer, since Objective-C pointers are now represented  
using ObjCObjectPointerType.  That actually might not be an issue  
since those regions shouldn't be boundable, but it's worth putting in  
an assertion.

+  else
+    T = TR->getValueType(Ctx);
+
    if (Loc::IsLocType(T) || (T->isIntegerType() && T->isScalarType 
())) {
      SVal V = ValMgr.getConjuredSymbolVal(E, T, Count);
      return Bind(state, ValMgr.makeLoc(TR), V);

Looking at the rest of the InvalidateRegion code, I'm actually a  
little dubious.  I know you didn't write it, but I think we should  
consider what the general solution should be here.  Consider:

     int x;
     struct s *p = (struct s*) &x;
     foo(p);

This code might actually be valid if 'sizeof(struct s) <= sizeof(int)'  
and the alignments of both 'int' and 'struct s' are compatible.   
Otherwise, we'll get a potential buffer overflow that we won't catch.  
Alternatively, we might not actually be doing the right thing.  I  
think this is something worth discussing.

Index: lib/Analysis/GRExprEngine.cpp
===================================================================
--- lib/Analysis/GRExprEngine.cpp	(revision 75492)
+++ lib/Analysis/GRExprEngine.cpp	(working copy)
@@ -1119,9 +1119,9 @@
      //  invalidate(y);  // 'x' now binds to a symbolic region
      //  int z = *y;
      //
-    if (isa<Loc>(V) && !Loc::IsLocType(Ex->getType())) {
-      V = EvalCast(V, Ex->getType());
-    }
+    //if (isa<Loc>(V) && !Loc::IsLocType(Ex->getType())) {
+    //  V = EvalCast(V, Ex->getType());
+    // }

This is the main change, but I cannot actually evaluate it yet.  Using  
the '-analyzer-viz-egraph-graphviz' option, I saw that for the  
following code:

     int *__gruep__ = ((int *)&((b)->grue));
     int __gruev__ = *__gruep__;

that all we end up doing is conjuring a new symbol for the second  
declaration of '__gruev__' (even when using RegionStoreManager).  This  
can be seen readily with some enhancements I made today to the pretty- 
printing code for regions and symbols.  This issue may be related to  
the test case I added to 'misc-ps-region-store.m' (which causes that  
file to now XFAIL).

I think your patch is a step in the right direction, and aside from  
the few comments I made I'd be happy with applying it and iterating on  
the design from there.

Ted
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20090713/5d179093/attachment.html>


More information about the cfe-commits mailing list