[llvm-commits] change objectsize signature

Mon May 7 18:56:05 PDT 2012

On 5/7/12 7:24 PM, Nuno Lopes wrote:
> Hi,
>
> Please find attached a patch to change the objectsize intrinsic's 
> signature.
> My proposal is to add a third parameter to control whether objectsize 
> is allowed to perform checks at run-time or not.
> This parameter is an integer, and a higher value indicates that you're 
> willing to accept a potentially higher run-time performance penalty, 
> while 0 means no work at run-time (the current behavior).
>
> The idea is to use this intrinsic, for example, for array bound checking.
>
> Nothing is changed yet in objectsize: this patch only changes the 
> signature of the intrinsic and implements an auto-upgrade.
>
> Comments, ideas, etc.?

Before I begin, I want to apologize for the lengthy reply.  However, 
I've been working on memory safety checks for a long time.
:)

My initial impression is that this is not the right approach.  
The objectsize intrinsic, in your design, lacks information that is 
useful for optimizing run-time checks and making them more stringent 
with link-time optimization.

First, your design does not distinguish between checks on 
loads/stores/atomics and checks on GEPs.  The problem with that is that 
some memory safety systems treat these checks differently.  Many systems 
(SoftBound and SAFECode being just two) will allow GEPs to generate 
out-of-bounds pointers so long as they are not dereferenced.  This means 
that a GEP check and a load/store check have different behavior when 
they fail.  This requires both different implementations of the checks 
as well as different rules of when and how they can be optimized.
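
To make the distinction concrete, here is a minimal sketch in C of the 
two kinds of check.  All names (check_result, gep_check, access_check) 
are hypothetical and are not the SoftBound or SAFECode API; the sketch 
only assumes that a failed GEP check is tolerated until the pointer is 
dereferenced, while a failed load/store check is an immediate error.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; not the SoftBound or SAFECode API. */
typedef enum {
    CHECK_PASS,      /* pointer is inside the object */
    CHECK_DEFERRED,  /* out of bounds, but only an error if dereferenced */
    CHECK_VIOLATION  /* out of bounds and about to be dereferenced */
} check_result;

/* True if ptr lies in [base, base + size). */
static int in_object(uintptr_t base, size_t size, uintptr_t ptr) {
    return ptr >= base && (size_t)(ptr - base) < size;
}

/* Check on the result of a GEP: failure is recorded, not reported. */
static check_result gep_check(uintptr_t base, size_t size,
                              uintptr_t result) {
    return in_object(base, size, result) ? CHECK_PASS : CHECK_DEFERRED;
}

/* Check on a load/store/atomic: failure is a memory safety violation. */
static check_result access_check(uintptr_t base, size_t size,
                                 uintptr_t ptr) {
    return in_object(base, size, ptr) ? CHECK_PASS : CHECK_VIOLATION;
}
```

Because the two checks fail differently, an optimization that is valid 
for one (for example, removing a deferred GEP check whose result is 
never dereferenced) is not automatically valid for the other.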

Second, load/store/atomic checks need the size of the memory access as 
well as its starting pointer to make sure that the load/store doesn't 
"fall off the end" of a memory object.  Your objectsize design does not 
provide that information.
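
As an illustration, the following sketch shows why the access size 
matters.  The function name and parameters are hypothetical, chosen 
only for this example; the point is that a load whose starting pointer 
is inside the object can still run past the end of it.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns true if the access
 * [ptr, ptr + access_size) lies entirely inside the object
 * [base, base + obj_size).  Written to avoid overflow in
 * ptr + access_size.
 */
static bool access_in_bounds(uintptr_t base, size_t obj_size,
                             uintptr_t ptr, size_t access_size) {
    if (ptr < base)
        return false;                 /* starts before the object */
    size_t offset = (size_t)(ptr - base);
    if (offset > obj_size)
        return false;                 /* starts past the end */
    /* Bytes remaining from ptr to the end of the object. */
    return access_size <= obj_size - offset;
}
```

For a 16-byte object, a 4-byte load at offset 12 passes this check, 
but an 8-byte load at the same offset does not, even though the 
starting pointer is identical.  A check that sees only the pointer 
cannot tell these two cases apart.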

Third, your design does not specify whether a check is on a pointer 
which is only manipulated by code internal to the program or whether the 
pointer can be manipulated by or returned from external code.  A usable 
memory safety system needs to know the difference; memory safety 
guarantees need to be relaxed for pointers handled by external library 
code.  Otherwise, the application may exhibit false positives during 
execution.

Additionally, whether a check is complete (because it checks a pointer 
handled by only internal code) or is incomplete (because it checks a 
pointer that can be manipulated by external code) needs to be 
communicated at the LLVM IR level.  This is because other LLVM IR-level 
analyses and transforms can be used to change incomplete checks to 
complete checks.  Your design currently leaves that problem to the code 
generator, making it more difficult for many LLVM developers to write 
incomplete-to-complete check transformations.
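
A rough sketch of the complete/incomplete distinction follows.  The 
names are hypothetical, and the sketch assumes one particular way of 
relaxing the guarantee: an incomplete check tolerates a pointer whose 
memory object is unknown to the run-time (for example, one allocated 
by an external library), while a complete check treats it as an error.

```c
#include <stdbool.h>

/* Hypothetical names; one possible way to model the distinction. */
typedef enum {
    CHECK_COMPLETE,   /* pointer handled only by internal code */
    CHECK_INCOMPLETE  /* pointer may come from external code */
} check_kind;

typedef enum {
    LOOKUP_IN_BOUNDS,      /* object known, access inside it */
    LOOKUP_OUT_OF_BOUNDS,  /* object known, access outside it */
    LOOKUP_UNKNOWN_OBJECT  /* run-time has no record of the object */
} lookup_result;

/* Returns true if the check should pass. */
static bool check_passes(check_kind kind, lookup_result lookup) {
    switch (lookup) {
    case LOOKUP_IN_BOUNDS:
        return true;
    case LOOKUP_OUT_OF_BOUNDS:
        return false;
    case LOOKUP_UNKNOWN_OBJECT:
        /* Only an incomplete check may give the benefit of the doubt. */
        return kind == CHECK_INCOMPLETE;
    }
    return false;
}
```

If the kind of check is visible in the IR, a later analysis that 
proves a pointer never escapes to external code can simply switch the 
check from incomplete to complete.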

In short, if you want to build a generic infrastructure for memory 
safety run-time checks, I strongly recommend that you start with the 
work we did on SAFECode; we've dealt with these issues, and we have a 
solution that we believe can be reused for memory safety tools other 
than our own and potentially by safe language implementations as well.  
As an FYI, a GSoC project that uses our lessons from SAFECode to make 
generic instrumentation passes for memory safety has been accepted and 
will be led by Kostya from Google and co-mentored by me (if I understand 
the arrangement correctly).

For the GSoC project, I should probably write up a document on the 
generic run-time checks and what they do.  Would you find this document 
useful?

-- John T.

> Thanks,
> Nuno