[LLVMdev] -fbounds-checking vs {SAFECode,ASan}

Fri May 25 09:20:28 PDT 2012

On 5/25/12 2:13 AM, Kostya Serebryany wrote:
>
>
> On Thu, May 24, 2012 at 9:23 PM, John Criswell <criswell at illinois.edu> wrote:
>
>     On 5/24/12 5:41 AM, Duncan Sands wrote:
>     > Hi Kostya, I'm also curious to know where Nuno is going with this,
>     > and the details of his design.  I'm worried he might be reinventing
>     > the wheel.  I'm also worried that he may be inventing a square
>     > wheel :)
>
>     I believe Nuno's goal is to prevent run-time exploitation of software.
>
>
> If that's the goal, the solution is likely to be wrong.
> The proposed bounds-checking will cover a tiny portion of buffer 
> overflows and will not cover use-after-free or stack corruption at all.

While I agree that you're probably correct about the bounds checking 
covering a tiny portion of buffer overflows, I don't think we really 
know that for certain.  An experiment to find the percentage of pointer 
arithmetic operations that can be checked in this way would be interesting.
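
To make concrete what "checked in this way" could mean, here is a minimal
sketch in C.  It assumes a check is only emitted where the size of the
object is visible at the access site.  The helper names
(access_in_bounds, checked_load) are hypothetical; they are not taken
from Nuno's patch or from SAFECode.  A pointer whose referent size is
unknown at the access site would simply get no check under this
assumption, which is why the coverage question matters.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if reading or writing `access_size` bytes at byte offset
   `offset` stays inside an object of `object_size` bytes, 0 otherwise.
   Written to avoid overflow in `offset + access_size`. */
int access_in_bounds(size_t object_size, size_t offset, size_t access_size)
{
    return access_size <= object_size &&
           offset <= object_size - access_size;
}

/* Hypothetical checked load of base[index], where `object_size` is the
   size in bytes of the array that `base` points to.  Returns 1 and
   stores the element in *out on success; returns 0 without touching
   memory if the access would be out of bounds. */
int checked_load(const int *base, size_t object_size, size_t index, int *out)
{
    assert(base != NULL && out != NULL);

    if (index > SIZE_MAX / sizeof(int))
        return 0;                       /* byte offset would overflow */

    size_t offset = index * sizeof(int);
    if (!access_in_bounds(object_size, offset, sizeof(int)))
        return 0;

    *out = base[index];
    return 1;
}
```

A real instrumentation pass would trap instead of returning 0; the
return value here only makes the rejected case easy to observe.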

> If the documentation says something like "it prevents run-time 
> exploitation", users may get a false sense of safety, which will make 
> matters worse.

I agree that any future documentation on any added security attack 
mitigation features should be clear on which attacks are prevented and 
which are not.

Having a security solution that prevents some attacks but not others is 
okay.  Defeating all memory safety attacks with acceptable performance 
is still an open research question.  What I think is important is 
knowing which attacks a technique defeats and at what cost; you 
essentially want to know what you're buying and for how much.

My concern with Nuno's approach is that it is not clear which exploits 
it will prevent and which it will not.  By contrast, if we implemented 
CFI, we would know *exactly* which types of attacks are prevented and 
which are not (and, I think, we'd mitigate a large number of attacks).  
I also suspect that the overhead of CFI may actually be lower than that 
of Nuno's proposed solution, although we'll need some experiments to be 
sure.

>
> Note that asan does not claim to "prevent run-time exploitation", 
> because in the general case it does not.
> If we want full prevention, we need to use another kind of sandbox 
> (e.g. Native Client).

Actually, Native Client is a sandboxing technique: it ensures that a 
plugin does not accidentally or intentionally read or write data from a 
program's "core" (the "thing" that the plugin plugs into).  Native 
Client doesn't prevent attackers from taking over and controlling the 
behavior of the plugin, and it doesn't prevent direct attacks on the 
core.  Some of the techniques used in Native Client (and PNaCl in 
particular) could be of interest, though.

There are some memory safety techniques (or combinations of techniques) 
that could provide what I guess you would call full memory safety.  
However, they all either rely upon garbage collection (which is 
conservative for C) or dangling pointer detection (which is too 
expensive at run-time) (2).

I think the next best thing is SAFECode with its automatic pool 
allocation technique (1).  With automatic pool allocation, SAFECode can 
optimize away type-safe loads and stores while ensuring that dangling 
pointer dereferences through type-safe pointers do not violate the 
memory safety guarantees.  It even gets you sound points-to analysis 
results, which I don't think any other technique (except those I listed 
above) gives you.

The challenge is that the automatic pool allocation transform and its 
prerequisite points-to analysis and type-inference analysis are 
relatively sophisticated pieces of code.  While I think the benefit is 
great, the required investment to make these pieces of code robust is 
not insignificant.

While I'd like LLVM to someday have a very strong memory safety attack 
mitigation feature, I think that a simpler mitigation technique that is 
fast and easier to implement is also valuable (as long as it's 
effective).  CFI appears to be a good candidate for these reasons.
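
As a rough illustration of the indirect-call half of CFI, here is a
minimal sketch in C.  It is not the label-based scheme from the Abadi et
al. paper, and it is not code from SAFECode; it assumes a plain
allow-list of valid targets, which a compiler would have to derive from
the call graph.  All names in it (handler_fn, allowed_targets,
cfi_target_allowed, cfi_checked_call) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*handler_fn)(int);

int handler_double(int x) { return 2 * x; }
int handler_negate(int x) { return -x; }

/* Has the right signature but is NOT in the allow-list; it stands in
   for an address an attacker managed to write into a function pointer. */
int not_a_valid_target(int x) { return x + 1000; }

/* Hypothetical allow-list of legal targets for this call site. */
static const handler_fn allowed_targets[] = {
    handler_double,
    handler_negate,
};

/* Returns 1 if `target` is a legal destination for the call, else 0. */
int cfi_target_allowed(handler_fn target)
{
    size_t n = sizeof allowed_targets / sizeof allowed_targets[0];
    for (size_t i = 0; i < n; i++) {
        if (allowed_targets[i] == target)
            return 1;
    }
    return 0;
}

/* Performs the indirect call only if the target passes the check.
   Returns 1 and stores the callee's result in *result on success;
   returns 0 and leaves *result untouched on a violation.  A deployed
   check would terminate the program instead of returning 0. */
int cfi_checked_call(handler_fn target, int arg, int *result)
{
    assert(result != NULL);

    if (!cfi_target_allowed(target))
        return 0;

    *result = target(arg);
    return 1;
}
```

The linear search is only for readability; the point is that the set of
permitted targets is fixed ahead of time, so it is easy to state exactly
which control-flow transfers the check allows.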

-- John T.

(1) http://llvm.org/pubs/2006-06-12-PLDI-SAFECode.html
(2) One of these solutions is SoftBound + CETS which has been built into 
the SAFECode Clang compiler.

>
> --kcc
>
>     Nuno, please correct me if I'm wrong.
>
>     And with all due respect to Nuno, I think he's reinventing the
>     wheel.  I implemented what he described using SAFECode in an evening
>     by writing two specialized passes that are needed to adjust
>     SAFECode's instrumentation to what Nuno needs (one pass removes
>     checks that are too expensive; the other inlines the fast checks to
>     remove function call overhead).  That code can still be found at
>     http://sva.cs.illinois.edu/fastsc-llvm.tar.gz.
>
>     I wrote a proposal for a common memory safety instrumentation
>     infrastructure and sent it to llvm-commits.  Would it be useful to
>     send it to llvmdev as well for discussion, or has everyone who's
>     already interested seen it?
>
>     Having said all this, if exploit mitigation is the goal, I think it
>     might be worth taking a step back and first determining *which* safety
>     properties one wants to enforce and what the expected overheads might
>     be.  IMHO, if I wanted a technique that could provide the most
>     security for the least code complexity and least run-time overhead, I
>     would implement control-flow integrity (CFI).  As far as I
>     understand, nearly all memory safety exploitation today is done by
>     diverting control-flow(*); CFI prevents that and is faster than any
>     other non-probabilistic mitigation in the literature.
>
>     There's a paper on CFI by Abadi et al.
>     (http://dl.acm.org/citation.cfm?id=1609956.1609960).  However, I don't
>     think we'd want to implement it in the same way they do; I'd recommend
>     run-time checks on indirect function calls and a split-stack approach
>     that allows checks on stores to just mask off bits in the pointer
>     address to prevent them from overwriting the return address on the
>     stack.
>
>     As an aside, I have a web site called the Memory Safety Menagerie
>     (http://sva.cs.illinois.edu/menagerie/index.html) that lists papers on
>     the topic of memory safety attack mitigation.  Those interested in
>     exploring the mitigation options might find it useful.
>
>     -- John T.
>
>     (*) Attacks that only change data-flow are possible and practical,
>     but I think these are a minority of attacks in the wild.  Attacks
>     that divert control-flow are not only common, but researchers have
>     now built tools to automate the creation of such attacks.
>
