[llvm-commits] Specification for Run-time Checks

Fri May 11 07:38:36 PDT 2012

On 5/11/12 3:47 AM, Alexander Potapenko wrote:
> John,
>
> What if one needs to ensure that his load or store is within an object
> of unknown size (think polymorphism)?

The lscheck (which will be split into loadcheck and storecheck) is 
responsible for determining the bounds of the memory object into which 
the pointer points if that information is needed to perform the run-time 
check.  The bounds information is recorded by 
pool_register_heap/stack/global() if necessary.

In the case of SAFECode, pool_register_*() records the bounds 
information in a splay tree, and lscheck consults that splay tree to see 
if the pointer it is checking falls within the bounds of a valid object 
and, if so, whether the last byte of the memory access is within the 
same object.
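
A minimal C sketch of the lookup described above. It is not SAFECode's implementation: a fixed-size array with a linear search stands in for the splay tree, and register_object() and check_access() are hypothetical names standing in for pool_register_*() and lscheck.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_OBJECTS 64

/* One registered memory object covering [start, start + size). */
struct object_bounds {
    uintptr_t start;
    size_t size;
};

static struct object_bounds objects[MAX_OBJECTS];
static size_t object_count = 0;

/* Hypothetical stand-in for pool_register_heap/stack/global().
 * Returns 0 on success, -1 if the table is full. */
int register_object(const void *start, size_t size) {
    if (object_count == MAX_OBJECTS)
        return -1;
    objects[object_count].start = (uintptr_t)start;
    objects[object_count].size = size;
    object_count++;
    return 0;
}

/* Hypothetical stand-in for lscheck.
 * Returns 1 if the first and last byte of the access fall within the
 * same registered object, 0 otherwise. */
int check_access(const void *ptr, size_t access_size) {
    uintptr_t addr = (uintptr_t)ptr;
    assert(access_size > 0); /* a load or store touches at least one byte */
    for (size_t i = 0; i < object_count; i++) {
        uintptr_t start = objects[i].start;
        size_t size = objects[i].size;
        if (addr >= start && addr - start < size) {
            /* First byte is inside; the last byte must be inside too. */
            return access_size <= size - (addr - start);
        }
    }
    return 0; /* pointer is not inside any registered object */
}
```

Nothing in the check depends on the static type of the pointer, only on the registered bounds, which is why code that does not know the object's size at compile time can still be checked.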

Polymorphic code shouldn't be a problem.

> Will such a situation be emulated with a fastgepcheck(src, dest, base, 0)?
> (BTW what is |base| and isn't it the same as |src|?)

No.  If you're checking a GEP %r = gep %q, x, y, ..., then the result
%r is dest and %q is src (i.e., %r is computed as an offset from
%q).  The base argument is the first address of the memory object into
which src and dest should point, and the length argument is the size of
that memory allocation.  Remember that fastgepcheck is used when static
analysis can determine which memory allocation site generates the memory
object into which the pointer should point.

The parameters to a fastgepcheck/fastlscheck can be dynamically 
computed.  If you did something like:

%i = <some dynamically computed value>
%p = alloc char, %i // Allocate an array of %i chars
%q = gep %p, 0, 3
%r = gep %q, 0, 2

..., then if you wanted a dynamic check on %r, you could do:

fastgepcheck (%q, %r, %p, %i)

which would check that %r points into the same memory
object as %q.  The optimization that changes gepcheck to fastgepcheck
would have concluded that %r should be pointing into the alloca
referenced by %p and that the number of bytes allocated by the alloca is %i.
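
A rough C model of what such a check could compute, under stated assumptions: fast_gep_check() and in_object() are hypothetical names, the function returns a flag rather than reporting an error, and a one-past-the-end pointer is treated as out of bounds (a real run-time may choose to allow it).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if addr lies in [base, base + length), else 0. */
static int in_object(uintptr_t addr, uintptr_t base, size_t length) {
    return addr >= base && addr - base < length;
}

/* Hypothetical model of fastgepcheck(src, dest, base, length):
 * both the GEP source and its result must point into the object
 * that starts at base and is length bytes long. */
int fast_gep_check(const void *src, const void *dest,
                   const void *base, size_t length) {
    uintptr_t b = (uintptr_t)base;
    return in_object((uintptr_t)src, b, length) &&
           in_object((uintptr_t)dest, b, length);
}

/* Mirrors the example above with %i = 8 and then %i = 4.  The 16-byte
 * array only keeps the C pointer arithmetic well defined; the checked
 * object is the first %i bytes of it. */
void fast_gep_check_example(void) {
    char storage[16];
    char *p = storage; /* %p */
    char *q = p + 3;   /* %q = gep %p, 0, 3 */
    char *r = q + 2;   /* %r = gep %q, 0, 2 */
    assert(fast_gep_check(q, r, p, 8) == 1); /* %r is inside 8 bytes */
    assert(fast_gep_check(q, r, p, 4) == 0); /* %r is past 4 bytes */
}
```

No lookup is needed here because base and length arrive as arguments; that is the saving over the general check, which has to find the object first.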

>
> I also think that it may be useful to pass an additional argument that
> defines the pool being used to pool_[un]register_heap.
> On some systems several allocators may be present in the program, and
> it might be a good idea to distinguish between them.

First, I'm realizing that I should probably rename those functions so
that they don't have the "pool" name.  That's a leftover from basing
them on the versions used in SAFECode.

Second, you're right that we need a tag argument to identify the type of 
allocator if multiple allocators are recognized (we did that in SVA for 
Linux).  That said, I'm not sure if we want to support non-malloc 
allocators in the initial design.

What we could do is replace the _stack, _heap, and _global versions of
pool_register with a single version that takes an "allocator type"
parameter.  The allocator type 0 would be global, 1 would be stack, 2
would be malloc, and anything else could be reserved for future
allocator support.
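
One possible shape for that single entry point, as a C sketch only. The names register_memory(), allocator_of(), and the enum constants are hypothetical, the final function names are still undecided, and tag 1 is taken to be the stack here, which is an assumption based on the three functions being replaced.

```c
#include <assert.h>
#include <stddef.h>

/* Allocator tags (hypothetical names); values above 2 are reserved. */
enum allocator_type {
    ALLOCATOR_GLOBAL = 0,
    ALLOCATOR_STACK  = 1, /* assumed meaning of tag 1 */
    ALLOCATOR_MALLOC = 2
};

struct registration {
    const void *start;
    size_t size;
    enum allocator_type type;
};

#define MAX_REGISTRATIONS 64
static struct registration registrations[MAX_REGISTRATIONS];
static size_t registration_count = 0;

/* Single registration function replacing the _stack, _heap, and
 * _global variants.  Returns 0 on success, -1 for a reserved tag or
 * a full table. */
int register_memory(const void *start, size_t size, int type) {
    assert(start != NULL);
    if (type < ALLOCATOR_GLOBAL || type > ALLOCATOR_MALLOC)
        return -1; /* reserved for future allocator support */
    if (registration_count == MAX_REGISTRATIONS)
        return -1;
    registrations[registration_count].start = start;
    registrations[registration_count].size = size;
    registrations[registration_count].type = (enum allocator_type)type;
    registration_count++;
    return 0;
}

/* Returns the tag recorded for the object starting at start, or -1
 * if no such object was registered. */
int allocator_of(const void *start) {
    for (size_t i = 0; i < registration_count; i++) {
        if (registrations[i].start == start)
            return (int)registrations[i].type;
    }
    return -1;
}
```

Keeping the tag next to the bounds would let an unregister call verify that an object is being released by the same kind of allocator that created it.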

I'll go ahead and make the above change to the design document.

-- John T.