[llvm-commits] change objectsize signature

Tue May 8 13:35:21 PDT 2012

>> The planed usage is the following:
>>
>> a[i] = ..
>>
>> converts to (using some loose syntax, and assuming that a[i] writes  
>> 4 bytes):
>>
>> %ptr = GEP(%a, %i)
>> if (objectsize(%ptr) >= 4)
>>  // OK
>> else
>>  __builtin_trap()
>>
>> (this is exactly what the current implementation does;
>
> I'm confused here: what is generating this code?  Clang? Dragonegg?   
> I assume we're talking about LLVM IR that was generated for C code.   
> Is my assumption wrong?

It can be C, yes. For example, clang can already emit this code.

>> I'm just proposing to extend objectsize to return non-constant values)
>
> Okay.  I don't think your original proposal was very clear.  You  
> want objectsize to return non-constant values so that it is more  
> effective at implementing these run-time checks; objectsize will not  
> be doing the check itself.  Do I understand correctly?

Yes!

> %p = load %q
> objectsize (%p)
>
> If you're doing lookups, how do you add/remove memory objects in the  
> lookup data structure?  If the addition/removal is not done at the  
> LLVM IR level, then how can it be optimized away when it is not  
> needed?

That's up to the implementation, but my proposal is not to go that  
far. Fat pointers, for example, usually incur in a big performance  
penalty.
But you can always extend the default implementation of objectsize to  
perform more checks. Also, that's why the proposal includes a  
"penalty" parameter, so that the user can specify how much performance  
he is willing to loose.

> o) Your check does not guarantee that %a and %ptr belong to the same  
> memory object.  All you're guaranteeing is that %ptr points into a  
> valid memory object with 4 bytes remaining in the memory object.   
> It's possible that %ptr overflowed the object pointed to by %a and  
> is now within the bounds of another memory object (stack objects  
> often have no padding in between them).  That's fine, but then it's  
> not an array indexing check; it's a load/store check.

Well, it's a GEP from %a, so it must point somewhere inside the buffer  
of %a. objectsize should not allow buffer overflows.

> o) The object size lookup (if you have one), object size comparison,  
> and trap are not atomic, which means that other instructions may get  
> executed in-between them.  What happens if the memory object is  
> freed by another thread in-between the comparison and the use of  
> %ptr?  Is that okay for your goals?

That's an interesting point that I didn't think about yet.

> o) If you're doing lookups, then objectsize can fail to find the  
> bounds because the %ptr is invalid or because %ptr could be pointing  
> to memory objects external to the code.  How do you tell if the  
> latter case is possible?

return (size_t)-1. If you don't know the object's size, just skip the  
check to avoid false positives.

> What you are proposing is a memory safety tool for production use;  
> it will be fast because it won't try to do expensive checks.  The  
> security policy it enforces seems reasonable.  You can build it the  
> way you describe, but I'd rather see it built with a more general  
> set of run-time checks that multiple tools can use.

Feel free to propose an alternative specification. Of course we're  
open to better designs.

Nuno