[cfe-commits] [PATCH] Set region size in GRRegionVals transfer function

Ted Kremenek kremenek at apple.com
Fri Nov 7 20:16:21 PST 2008


On Nov 7, 2008, at 7:56 PM, Zhongxing Xu <xuzhongxing at gmail.com> wrote:
>
> On Sat, Nov 8, 2008 at 1:01 AM, Ted Kremenek <kremenek at apple.com>  
> wrote:
>
> On Nov 7, 2008, at 5:34 AM, Zhongxing Xu wrote:
>
> Attached is a scratch implementation of array bound checking. It  
> seems that implementing array bound checking is pretty  
> straightforward. No need to make big change to the framework.
> <oob.patch>
>
> Wow!  Great work.  This is very simple.  This is a very good start.   
> I think we can apply it now and then iterate.
>
> One thing that I think is worth mentioning is that not all  
> MemRegions represent chunks of memory for which we want to enforce  
> strict software segmentation.  Consider the following code:
>
> struct S {
>  unsigned length;
>  char buf[];
> };
>
> unsigned getLength(char* data) {
>  struct S* s = (struct S*) (data - offsetof(struct S, buf));
>  return s->length;
> }
>
> I may not have written this correctly, but I have seen code like  
> this before in some large open source C project (I don't remember  
> which).  I'm not certain if such code poses a challenge for out-of- 
> region checking, as conceptually we might represent the above code  
> with several regions layered on top of each other.
>
> For example, the 'data' argument could be a pointer to a MemRegion  
> representing a character array with an extent of $extent(data) (I'm  
> using '$' to represent a symbolic value).  This extent would be a  
> subregion of a region representing a 'struct S' object.  Through  
> pointer arithmetic, we get a pointer value that refers to the front  
> of the 'struct S' object.  The subsequent access through 's->length'  
> first causes us to get the FieldRegion for 'length' of 's' and then  
> perform the load (which works as expected).
>
> I know this may seem like a contrived example, and it might not be  
> something we should even care about right now, but I thought it  
> would bring it up.
>
> Another use of MemRegions that I thought of was bit-level typing  
> (which I'm not saying we should implement right now, or ever).  For  
> example, suppose you have an unsigned integer variable whose bits  
> are used as flags.  While some programmers may use bit fields for  
> this task, others just use shifts and masks.  MemRegions provide a  
> nice abstraction to segment out the individual bits of an integer,  
> allowing us to potentially perform bit-level typing (http://portal.acm.org/citation.cfm?doid=1181775.1181791 
> ) or simply have better symbolic reasoning for bit values.  While  
> one cannot take the address of a bit, one can take the address of  
> specific words, compute the address of related words using pointer  
> arithmetic, etc., and everything is fine (no out-of-region access  
> errors).
>
> My meta point here is that we probably need a simple interface in  
> StoreManager to determine if accessing beyond the bounds of an  
> extent is okay.  Sometimes it is okay to access out of the current  
> region as long as it doesn't exceed the bounds of some ancestor  
> region (which represents the allocated buffer, for example).   
> Perhaps that means we need to reason about "canonical  
> locations" (similar to how SourceManager reasons about logical and  
> physical locations), where canonical locations could represent the  
> byte location within a chunk of segmented memory.  I honestly don't  
> know.
>
> I don't think these details are high on the priority list of things  
> to worry about, but I thought I would mention them now before we  
> hardwire our implementation of array-bounds checking with too many  
> assumptions.
>
> Understood. We can watch these cases when we make relevant design  
> decisions in the future.

Agreed.

>
> My feeling is that StoreManager should take the responsibility of  
> doing correct pointer arithmetic or bit-level reasoning. It is up to  
> the StoreManager to return the correct region. Therefore other parts  
> of the analyzer could stay unknown of the details.

Agreed.  Hopefully we will be able to stay within such a modular design.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20081107/175d332e/attachment.html>


More information about the cfe-commits mailing list