[LLVMdev] Adding diversity for security (and testing)

John Criswell criswell at illinois.edu
Tue Sep 10 08:19:53 PDT 2013

On 9/9/13 5:24 PM, Nick Lewycky wrote:
> [snip].
>     Briefly looking at ASAN again, I saw a performance penalty of 2x
>     mentioned. Diversity could act as both defense in depth, and as a
>     lower-impact defense for performance critical code.
> Okay, I've been thinking about what I wanted answered here, and I've 
> decided that what I want to know is too complex for this discussion. 
> It boils down to: given we have all the power of clang and llvm for 
> very complex analysis, both static and dynamic, why is randomizing the 
> best we can do? Can't we somehow use all that static and dynamic 
> analysis to shrink the problem down to something we can solve more 
> cleverly than randomizing the program across a lot of axes, such as 
> proving (or arranging for) certain properties which reduce how much we 
> need to randomize, etc? And if not *why can't we*? It's that last part 
> which I think is hardest to answer, so I've decided I'll leave this to 
> the security-trained folks -- if they think this is the right 
> approach, they're probably right.

You are right that, with LLVM and Clang's facilities, we can do better.  
However, in my opinion, the cost of many of these solutions is still too 
high (both in terms of development cost and run-time performance) for 
industry to accept them.  I also think that many people do not 
understand the available options, partly because the work on the 
subject is spread across four different communities (compilers, 
operating systems, security, and software engineering).

Automatic, very strong memory safety guarantees are hard to enforce on 
existing C code with good efficiency.  It can be done, but it requires 
sophisticated analyses (e.g., type inference, whole-program points-to 
analysis) and, therefore, requires a decent level of expertise and 
incurs significant development cost.  Industry is also very picky about 
run-time performance; Vikram and I were once told by a Bay Area company 
that they would only use a security solution if it added 0% run-time 
overhead.  Such performance goals will kill just about any solution.

Enforcing a weaker property like control-flow integrity looks very 
promising in terms of development cost, performance overhead, and 
security.  I'm not sure why it doesn't get more attention.  Perhaps no 
one is yet willing to work out the remaining compatibility issues with 
native code libraries, or maybe industry wants someone else to build an 
open-source implementation before committing to further development, or 
maybe people are simply unaware of what it is.  I'm curious to know.
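
For anyone who has not met control-flow integrity before, here is a 
rough sketch in C of the basic idea: before an indirect call, check 
that the function pointer is one of the targets the program could 
legitimately reach from that call site.  Everything in it (handler_t, 
valid_targets, cfi_target_ok, checked_call, cfi_demo) is a hypothetical 
name made up for illustration.  A real implementation would have the 
compiler derive the target set from a call graph and would use a much 
cheaper check than a linear scan; this is not how any particular tool 
does it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical signature shared by all targets of one indirect call site. */
typedef int (*handler_t)(int);

static int twice(int x)        { return 2 * x; }
static int negate(int x)       { return -x; }
/* Same type as the others, but not a legitimate target of the call site. */
static int not_a_target(int x) { return x + 1000; }

/* Assumed to be produced by an (imagined) compiler pass: the set of
   functions this call site may legitimately reach. */
static const handler_t valid_targets[] = { twice, negate };

/* Return 1 if fp is in the allowed set, 0 otherwise. */
static int cfi_target_ok(handler_t fp) {
    size_t n = sizeof valid_targets / sizeof valid_targets[0];
    for (size_t i = 0; i < n; i++)
        if (valid_targets[i] == fp)
            return 1;
    return 0;
}

/* The instrumented indirect call: abort instead of jumping to an
   unexpected target (e.g., a corrupted function pointer). */
static int checked_call(handler_t fp, int arg) {
    if (!cfi_target_ok(fp))
        abort();
    return fp(arg);
}

/* Small self-check; call it from main() to exercise the sketch. */
static int cfi_demo(void) {
    assert(cfi_target_ok(twice));
    assert(!cfi_target_ok(not_a_target));
    return checked_call(negate, 7);
}
```

Note that the check only constrains where an indirect call may go.  It 
says nothing about whether the data the program reads or writes is 
valid, which is why this is a weaker property than full memory safety.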

If you're interested in knowing more about the topic, take a look at the 
Memory Safety Menagerie at http://sva.cs.illinois.edu/menagerie.  I need 
to update it since a few new papers have come out, but it provides a 
good number of papers on the subject (including quite a few that use LLVM).

My two cents.

-- John T.
