[LLVMdev] Adding diversity for security (and testing)

Thu Aug 29 11:38:12 PDT 2013

On Wed, Aug 28, 2013 at 10:31 PM, John Criswell <criswell at illinois.edu> wrote:
> On 8/28/13 4:37 PM, Nick Lewycky wrote:
>
> On 26 August 2013 11:39, Stephen Crane <sjcrane at uci.edu> wrote:
>>
>> Greetings LLVM Devs!
>>
>> I am a PhD student in the Secure Systems and Software Lab at UC
>> Irvine. We have been working on adding randomness into code generation
>> to create a diverse population of binaries. This diversity prevents
>> code-reuse attacks such as return-oriented programming (ROP) by
>> denying the attacker information about the exact code layout. ROP has
>> been used in several high-profile recent attacks, and has also been
>> used as a jailbreaking avenue. We believe our transformations would
>> provide a significant security benefit for LLVM users who choose to
>> use diversity. For more details see [1] (although we are currently
>> proposing to upstream only a simplified subset of our work).
>>
>> We would like to contribute some of our work back to the community,
>> and are preparing a small patch adding two new features: NOP insertion
>> and schedule randomization. The NOP insertion pass randomly adds NOPs
>> after each MachineInstr according to a command-line
>> parameter. Currently NOP insertion is implemented for X86, and we are
>> adding support for ARM. The schedule randomizer randomly picks a valid
>> instruction to schedule at every point, bypassing the scheduling
>> heuristics. These passes result in a binary which, while slightly
>> slower, is far more secure against code-reuse attacks. In addition,
>> schedule randomization may be useful for randomized compiler and
>> micro-architecture testing.
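
The NOP insertion idea described above can be sketched roughly as follows. This is only an illustration in Python over a list of instruction strings, not the proposed LLVM pass: the function name `insert_nops`, the `probability` parameter (standing in for the command-line parameter), and the `seed` argument are all assumptions made for the example. Python's `random` module is also not cryptographically secure, so a real deployment would need a different generator.

```python
import random


def insert_nops(instructions, probability, seed):
    """Return a copy of `instructions` with a NOP randomly added after each one.

    Hypothetical helper for illustration only; it models instructions as
    plain strings rather than MachineInstrs.
    """
    rng = random.Random(seed)  # seeded so a given seed always yields the same layout
    out = []
    for inst in instructions:
        out.append(inst)
        # rng.random() is in [0, 1), so probability 0 never inserts
        # and probability 1 always inserts.
        if rng.random() < probability:
            out.append("nop")
    return out


code = ["push rbp", "mov rbp, rsp", "mov eax, 1", "pop rbp", "ret"]
variant_a = insert_nops(code, 0.5, seed=1)
variant_b = insert_nops(code, 0.5, seed=2)
```

Removing the NOPs from either variant gives back the original sequence, so the program's behaviour is unchanged while the offsets of the instructions that follow each NOP shift.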
>>
>> We would also include a secure random number generator which links
>> against OpenSSL. This would of course be an optional module disabled
>> by default, but is necessary so the randomization is cryptographically
>> secure and useful in security applications.
>>
>> We are in the process of writing test cases and double checking
>> formatting to produce a useful patch, but would like to solicit
>> feedback on our proposed changes before submitting patches for
>> detailed consideration.
>
>
> Thanks. This is really interesting -- it's hard to find good protections
> against ROP attacks, and this is promising. However, I'm going to start with
> the "bad news" first. I have a few questions as to whether this is useful
> for our (llvm's) users in the real world:
>
> 1. I'm concerned about the deployment problem. I realize that being in the
> compiler means you can transform the program in more exciting ways, but it
> gives you a much worse deployment story than something which modifies the
> program on disk like "prelink".
>
> 2. Does this actually fill a gap in our protections? How do we ever get into
> the situation where the user is able to deploy a ROP attack against us,
> without tripping asan or ubsan or something caught by our warnings or the
> static analyzer or any of the other protections offered by clang and llvm?
> It may suffice that there exists a niche which can't afford the performance
> penalty from asan or other things, but then we'll need to discuss what the
> performance impact is.
>
>
> Tools like asan and ubsan don't detect every type of problem.  From my
> understanding of asan, it only performs checks on loads and stores and
> delays reuse of virtual address space for memory allocations, so it is not
> guaranteed to catch all use-after-free and array bounds errors.  To catch
> everything, you would need to use a tool like Softbound + CETS.  Even then,
> you need to exercise all the program paths, which I suspect does not happen
> in practice.
>
> Static analysis tools don't find every possible memory safety bug, either.
> I believe it's theoretically impossible for them to do so.
>
> Run-time protections are attractive because they can guarantee that some bad
> behavior does not happen in any execution of the program.
>
>
>
> 3. Reproducible builds are a must. GCC has a -frandom-seed=text flag and you
> should use that here too. Given the same random seed, the compiler shall
> produce the same output.
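
One way to picture the reproducibility requirement above, applied to schedule randomization: derive the generator from the seed text, then make every random choice from it. The sketch below is an assumption-laden toy in Python, not how GCC or LLVM implement anything. The names `rng_from_seed_text` and `random_schedule` are hypothetical, instructions are plain strings, and the dependency map is made up for the example.

```python
import hashlib
import random


def rng_from_seed_text(seed_text):
    """Build a deterministic generator from a seed string (hypothetical helper)."""
    digest = hashlib.sha256(seed_text.encode("utf-8")).digest()
    return random.Random(int.from_bytes(digest, "big"))


def random_schedule(deps, seed_text):
    """Pick a random valid order; `deps` maps each instruction to its prerequisites."""
    rng = rng_from_seed_text(seed_text)
    remaining = {inst: set(d) for inst, d in deps.items()}
    order = []
    while remaining:
        # Sort the ready set so the result depends only on the seed,
        # not on dictionary iteration order.
        ready = sorted(i for i, d in remaining.items() if not d)
        if not ready:
            raise ValueError("dependency cycle")
        pick = rng.choice(ready)  # bypass heuristics: any ready instruction may go next
        order.append(pick)
        del remaining[pick]
        for d in remaining.values():
            d.discard(pick)
    return order


deps = {
    "load a": set(),
    "load b": set(),
    "add": {"load a", "load b"},
    "store": {"add"},
}
first = random_schedule(deps, "build-42")
second = random_schedule(deps, "build-42")
```

Here `first` and `second` are identical because both runs use the same seed text; a different seed text may or may not produce a different order, but any order produced respects the dependencies.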
>
> Ultimately my concern here derives from the fact that I do use clang
> warnings today and I do use asan and ubsan today. If this ROP-protection
> were added, I don't immediately see how I would use it. Just because I'm not
> inside the target audience doesn't mean the target audience doesn't exist,
> but I am asking for a little justification.
>
> And one issue for us, the llvm developers. If we're going to accept this
> feature, own it and maintain it, how can we test it? We know how to test for
> correctness and measure performance, we even understand what protections
> -fstack-protector or ASAN offer and can write spot-tests for them, but how
> can we measure the effectiveness of probability-based ROP-protections? We
> don't need an answer to this right now, but over time we'll be maintaining
> it, and it seems plausible that we could accept a patch which accidentally
> diminishes the actual security provided (imagine us maintaining a random
> number generator -- it's possible to add a weakness which isn't easily
> noticed).
>
> There's a "good news" side to this too. Over lunch I talked with one of our
> security guys, and he's excited. He tells me that diversity for
> ROP-protection is entirely the way to go, and speculates that it ought to be
> deployable by other vendors. Furthermore, we expect that rotating the
> register allocation is going to be especially painful on the exploit
> authors.
>
>
> Is there a reason why your security guy thinks that diversity is a better
> protection than control-flow integrity (CFI)?  Recent work on CFI [1]
> indicates that even very conservative call graphs remove nearly all gadgets
> from the list of valid function targets (hence they cannot be used).  That
> implementation has an average overhead of 3.6% and a max overhead of 8.6%.
> A more flexible implementation [2] that already works for LLVM has average
> overhead of 10%.  There's more recent work on CFI (including at least one
> paper that uses LLVM) published this month at Usenix Security, but I haven't
> had time to read it yet.
>
> Control-flow integrity should be effective in stopping ROP attacks, and we
> should be able to deploy it without using inter-procedural analysis
> (although inter-procedural analysis can improve the call graph results to
> make the program even more secure).
>
> I think diversity is a nice thing to have to provide defense in depth, but I
> currently think that CFI will provide the most bang for the buck.
>

As the aforementioned security guy, let me be clear: I'm excited about
CFI and ASAN too :).

IMO, there's enough room in the world for mitigations at both the
low-cost and the high-value ends of the scale. I don't have enough
data (or patches, hint hint) to figure out if this does what it says on
the tin, but my hope is that it will turn out to be a low-cost technique
to deploy in places where more comprehensive tools are considered
prohibitively expensive.

Geremy Condra

> -- John T.
>
> [1] http://www.cs.berkeley.edu/~dawnsong/papers/Oakland2013-CCFIR-CR.pdf
> [2] http://www.eecs.harvard.edu/~greg/papers/cfiDataSandboxing.pdf
>
>
> Looking forward to seeing patches!
>
> Nick
>
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>