This scheme looks great to me! Two points:<br><br>a. Symbolic regions may have types. For example: void foo(char* buf) {...}. 'buf' => SymbolicRegion with type char*<br><br>b. Applying the scheme to your test case:<br>

<br>// <rdar://problem/6451816><br>

typedef unsigned char Boolean;<br>

typedef const struct __CFNumber * CFNumberRef;<br>

typedef signed long CFIndex;<br>

typedef CFIndex CFNumberType;<br>

typedef unsigned long UInt32;<br>

typedef UInt32 CFStringEncoding;<br>

typedef const struct __CFString * CFStringRef;<br>

extern Boolean CFNumberGetValue(CFNumberRef number, CFNumberType theType, void *valuePtr);<br>

extern CFStringRef CFStringConvertEncodingToIANAC<div id=":17" class="ii gt">arSetName(CFStringEncoding encoding);<br>

<br>

CFStringRef rdar_6451816(CFNumberRef nr) {<br>

  CFStringEncoding encoding;<br>

  // &encoding is casted to void*.  This test case tests whether or not<br>

  // we properly invalidate the value of 'encoding'.<br>

  CFNumberGetValue(nr, 9, &encoding);<br>

  return CFStringConvertEncodingToIANACharSetName(encoding); // no-warning<br>

}<br><br>We would have to invalidate a TypedViewRegion(void*, VarRegion('encoding')). Then StoreManager should canonicalize it to VarRegion('encoding'). <br>Does that mean we have a new rule for canonicalization of discarding void* typed view besides discarding typedefs? That is, void* typed view is another kind of 'sugar' like typedefs?<br>

</div><br><br><div class="gmail_quote">On Wed, Mar 11, 2009 at 5:08 AM, Ted Kremenek <span dir="ltr"><<a href="mailto:kremenek@apple.com">kremenek@apple.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

<div class="im"><br>

On Mar 10, 2009, at 12:40 AM, Zhongxing Xu wrote:<br>

<br>

<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

<br>

I see your point. But how can we recognize which casts are semantic changes and which are sugar typed?<br>

I doubt there is a consistent rule for this.<br>

<br>

</blockquote>

<br></div>

I'm fairly certain there is a consistent rule for this.  Actually, it is very simple: semantic changes due to type views on regions occur when the view would result in the chunk of memory being interpreted differently.  In the case of typedefs, they are just sugar for the underlying type, and don't change the semantics.<br>


<br>

I've worked out a simple example to explain a proposal I have for possibly dealing with this issue.  The basic rules are as follows:<br>

<br>

(a) Symbolic regions potentially don't have types other than "void*".  That is, they are just chunks of memory.<br>

<br>

(b) We "canonicalize" regions when performing bindings to them, but don't need to canonicalize regions when they are used as values for expressions (see example below).<br>

<br>

(c) We strip away type views when performing pointer comparisons.  The type-checking in the frontend will tell us whether or not a pointer comparison is legal.<br>

<br>

Here is my example.  There are other cases to consider, such as how to handle Objective-C and C++ objects, but I think this is a good starting point:<br>

<br>

<br>

typedef struct s* MyPointer;<br>

void* foo(void);<br>

<br>

   void bar(void) {<br>

1.   MyPointer q = foo();<br>

2.   void *r = q;<br>

3.   struct s* p = r;<br>

4.   if (p == q)<br>

5.      q->f = 10;<br>

6.   if (q->f == 10)<br>

        ...<br>

7.   if (p->f == 10)<br>

   }<br>

<br>

<br>

What happens:<br>

<br>

(a) On line 1 'foo()' evaluates to SymbolicRegion(Sym0).<br>

<br>

(b) On line 1 the value of (a) is then casted to<br>

    TypedViewRegion(MyPointer, SymbolicRegion(Sym0)).<br>

<br>

(c) On line 1 that value is bound to 'q'.  Internally it is stored as<br>

    'q' => TypedViewRegion(MyPointer, SymbolicRegion(Sym0)).  Observe that the<br>

    typedef is retained as it is used as a value in the binding and not the key.<br>

<br>

    Note that the syntax:<br>

<br>

     'q' => some value<br>

<br>

    is shorthand for<br>

<br>

      VarRegion('q') => some value<br>

<br>

(d) On line 2 we perform a load from 'q' (i.e., VarRegion('q')) to get the value<br>

    TypedViewRegion(MyPointer, SymbolicRegion(Sym0)).<br>

<br>

(e) On line 2 we cast (d) to void*.  This results<br>

    in stripping the 'TypedViewRegion' and getting 'SymbolicRegion(Sym0)'.  This<br>

    value is directly bound to 'r' to get the binding<br>

   'r' => SymbolicRegion(Sym0).<br>

<br>

(f) On line 3 we load from 'r' to get SymbolicRegion(Sym0).<br>

<br>

(g) On line 3 we cast (f) to 'struct s*', leaving us with the region<br>

    TypedViewRegion(struct s*, SymbolicRegion(Sym0)).<br>

<br>

(h) On line 3 we bound the value of (g) to 'p'.  This leaves us with the<br>

    binding 'p' => TypedViewRegion(struct s*, SymbolicRegion(Sym0)).<br>

<br>

(i) On line 4 we load both p and q, getting the values<br>

     TypedViewRegion(struct s*, SymbolicRegion(Sym0)) (for 'p') and<br>

     TypedViewRegion(MyPointer, SymbolicRegion(Sym0)) (for 'q').<br>

<br>

   We then do a pointer comparison by erasing their "views" and see if they<br>

   refer to the same memory object.  This means getting the first "non-view"<br>

   ancestor region.  In this case, the ancestor region is SymbolicRegion(Sym0)<br>

   in both cases.  This causes 'p == q' to evaluate to true.<br>

<br>

   The rationale behind this view-stripping is that we are just comparing<br>

   pointer addresses at this point, and "views" reason about the interpretation<br>

   of the bytes within a region but not the address of the region itself.<br>

<br>

   The "base region" doesn't have to be a SymbolicRegion. For example it could<br>

   be a VarRegion.<br>

<br>

(j) On line 5 we evaluate the l-value of 'q->f'.  This value is<br>

    FieldRegion('f',TypedViewRegion(MyPointer, SymbolicRegion(Sym0))).<br>

<br>

(k) On line 5 we ask the store manager to store '10' to the l-value computed<br>

    at (j).  Internally, within the StoreManager, we translate:<br>

<br>

      FieldRegion('f',TypedViewRegion(MyPointer, SymbolicRegion(Sym0))).<br>

<br>

    to:<br>

<br>

      FieldRegion('f',TypedViewRegion(struct s*, SymbolicRegion(Sym0))).<br>

<br>

    and then bind '10' to the latter.<br>

<br>

    Here we have "de-sugared" the TypedViewRegion so that all typedefs are<br>

    translated to the underlying type.<br>

<br>

    The rationale for this de-sugaring is that we want to "canonicalize" the key<br>

    (and *not* the value) in the binding. This makes retrievals from the store<br>

    easy in the presence of typedefs (which don't change the semantics of how a<br>

    region is interpreted).<br>

<br>

(l) One line 6 we compute the l-value of 'q->f' to be<br>

    FieldRegion('f',TypedViewRegion(MyPointer,SymbolicRegion(0))) and then ask<br>

    the store manager to load from that region.<br>

<br>

    Within the StoreManager, the key is again canonicalized to<br>

    FieldRegion('f',TypedViewRegion(struct s*, SymbolicRegion(0))).<br>

<br>

    The retrieval from the store then uses this key, getting the value 10.<br>

<br>

(m) Similarly, on line 7 we compute the l-value of 'p->f' to be<br>

    FieldRegion('f',TypedViewRegion(struct s*, SymbolicRegion(0))) and then ask<br>

    the store manager to load from that region.<br>

<br>

    Within the StoreManager that key is also canonicalized, but since it is<br>

    already canonical we just do the load.<br>

<br>

</blockquote></div><br>