[cfe-dev] [Analyzer] Obtain MemRegion corresponding to a pointer expression that has been cast to a different type

Ted Kremenek via cfe-dev cfe-dev at lists.llvm.org
Wed Aug 19 08:57:43 PDT 2015


Hi Scott,

I don’t actually see why you need to look at the structure of the AST here at all.  The analyzer does a full symbolic execution, so there is a powerful separation between syntax and semantics right at your fingertips.

I would approach this from a different angle.  Once you have the location, in this case ‘l’, it should be an ElementRegion.  That represents the cast from the original MemRegion (a VarRegion) to uint8_t*.  Then just strip off the ElementRegion to get back to the VarRegion.  The MemRegion design captures how the casts were used to change the interpretation of a piece of memory.  It’s all right there in the MemRegion hierarchy.

AST-based approaches like this are fundamentally very brittle.  For example, you would need to do something different if the code was instead written like this:

  void foo() {
    struct S x;
    uint8_t *y = (uint8_t *)&x;
    bar(y);
  }

If you just use the MemRegions directly, these syntactic differences are irrelevant.  The MemRegions capture the actual semantics of the value you are working with.  In this case, the analyzer knows that the original memory address is for the VarRegion for ‘x’.
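To make the idea concrete, here is a minimal, self-contained sketch of the region layering.  The `Region` struct and the `viewAs` and `stripCasts` helpers are toy stand-ins invented for illustration; they are not the analyzer's classes.  In a real checker the equivalent should be roughly `Call.getArgSVal(0).getAsRegion()` followed by `MemRegion::StripCasts()`, but treat that as an assumption and check the headers for your Clang version.

```cpp
#include <cassert>
#include <string>

// Toy model of the region hierarchy (hypothetical, NOT the Clang API).
struct Region {
  enum Kind { Var, Element } kind;
  const Region *super;  // enclosing region; null for a variable
  std::string name;     // variable name (Var) or element type (Element)
  long index;           // element index; 0 for a plain pointer cast
};

// Model a pointer cast: the same memory viewed as element 0 of an
// array of `elemTy`, layered on top of the original region.
Region viewAs(const Region *base, const std::string &elemTy) {
  return Region{Region::Element, base, elemTy, 0};
}

// Peel off the zero-index element layers that casts introduce and
// return the region underneath. A non-zero index is real pointer
// arithmetic, not a cast, so it is left in place.
const Region *stripCasts(const Region *R) {
  assert(R && "expected a non-null region");
  while (R->kind == Region::Element && R->index == 0)
    R = R->super;
  return R;
}
```

With a `Region x{Region::Var, nullptr, "x", 0}`, both `bar((uint8_t *)&x)` and the `y` variant pass a value modeled by `viewAs(&x, "uint8_t")`, and `stripCasts` on either one yields `&x`.  The way the source spelled the cast never enters into it.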

Typically, if you find yourself going to the AST itself to do these kinds of operations, the approach is inherently wrong.  Syntactic approaches work reasonably well for the compiler, where cheap local analysis is all you have.  For the static analyzer, there is so much semantics captured in the ProgramState that you can go far beyond the reasoning power of syntactic checks like this.

Cheers,
Ted

> On Aug 19, 2015, at 8:44 AM, scott constable via cfe-dev <cfe-dev at lists.llvm.org> wrote:
> 
> Hi All,
> 
> I'm analyzing something like the following code:
> 
> struct S {
>   int a;
>   char b;
>   int c;
> };
> 
> void foo() {
>   struct S x;
>   bar((uint8_t *)&x);
> }
> 
> When I reach the CallEvent corresponding to the call to bar(), I would like to extract the MemRegion corresponding to x, i.e. by ignoring the (uint8_t *) cast. My code looks something like this:
> 
> const Expr *arg = Call.getArgExpr(0);
> SVal addrVal = State->getSVal(arg, LCtx);
> Optional<Loc> l = addrVal.getAs<Loc>();
> if (!l) // must be a null pointer
> 	return nullptr;
> 
> QualType T = getPointedToType(arg);
> return State->getSVal(*l, T).getAsRegion();
> 
> where getPointedToType() is defined as
> 
> QualType getPointedToType(const Expr *E) {
> 	assert(E);
> 	if (!isPointer(E))
> 		return QualType();
> 	if (const CastExpr *cast = dyn_cast<CastExpr>(E))
> 		return getPointedToType(cast->getSubExpr());
> 
> 	const PointerType *Ty =
> 		dyn_cast<PointerType>(E->getType().getCanonicalType().getTypePtr());
> 	if (Ty)
> 		return Ty->getPointeeType();
> 	return QualType();
> }
> 
> Everything seems to work just fine until the call to State->getSVal(*l, T), which returns a NonLoc. If I instead call State->getSVal(*l) without the pointed-to type, then I do get a MemRegion, but it's an element region of type uint8_t, NOT what I want.
> 
> Am I doing something wrong? Is there a much easier way to do this?
> 
> ~Scott Constable
> _______________________________________________
> cfe-dev mailing list
> cfe-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev



