[LLVMdev] Secure Virtual Machine

Fri Jun 15 10:20:04 PDT 2007

Let me cut it down to the core problem: I'm asking about the
feasibility of extending LLVM with constructs to manage separate
heaps. Given my current understanding of LLVM, I can see this done in
two ways:

1. Add heap management instructions to the core instruction set, and
either modify the allocation routines to name a heap explicitly, or
modify the runtime to rebind the allocation routines according to some
VM-level context that names the current heap (thread-local storage?).

2. Add intrinsics to start a new heap (via a new ExecutionEngine?).
This would involve modifying the VM to accept allocation primitives as
function pointers.
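
To make the first option concrete, here is a minimal sketch of
rebinding allocation through a thread-local context. It is written in
Python purely for illustration and is not LLVM code: Heap, using_heap
and vm_malloc are hypothetical names, and the byte budget is an
assumption standing in for whatever limit a real heap would enforce.

```python
import threading
from contextlib import contextmanager


class Heap:
    """Toy heap with a fixed byte budget (hypothetical, illustration only)."""

    def __init__(self, name, limit):
        self.name = name
        self.limit = limit
        self.used = 0

    def allocate(self, size):
        # Refuse the request once this heap's own budget is exhausted.
        if self.used + size > self.limit:
            raise MemoryError(f"heap {self.name!r} exhausted")
        self.used += size
        return bytearray(size)


# VM-level context naming the current heap, one slot per thread.
_context = threading.local()


@contextmanager
def using_heap(heap):
    """Bind `heap` as the current heap for the duration of the block."""
    previous = getattr(_context, "heap", None)
    _context.heap = heap
    try:
        yield heap
    finally:
        # Restore the previously bound heap, so bindings nest correctly.
        _context.heap = previous


def vm_malloc(size):
    """Allocation routine that never names a heap; the context does."""
    heap = getattr(_context, "heap", None)
    if heap is None:
        raise RuntimeError("no heap bound to this thread")
    return heap.allocate(size)
```

Code that calls vm_malloc inside "with using_heap(h):" is charged to h,
so a runaway allocator exhausts only its own heap and gets a
MemoryError instead of starving the rest of the program.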

So a program or language with real-time constraints, where an
incremental GC is preferable for some tasks and an efficient,
non-incremental GC is used for the rest, can be expressed as
partitioned heaps, each with its own GC.
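
As a rough sketch of that partitioning, the Python below gives each
heap its own allocation primitive and its own collector, both passed
in as plain callables (the "function pointers" of the second option).
Everything here is hypothetical: Heap, full_collect and
make_incremental_collect are made-up names, and the "live" flag is an
assumed stand-in for real reachability analysis.

```python
class Heap:
    """Heap parameterized by its allocation primitive and GC policy."""

    def __init__(self, name, allocate, collect):
        self.name = name
        self._allocate = allocate   # callable: size -> storage
        self._collect = collect     # callable: heap -> number of cells freed
        self.objects = []           # cells owned by this heap
        self.cursor = 0             # sweep position, used by incremental GC

    def new(self, size):
        cell = {"data": self._allocate(size), "live": True}
        self.objects.append(cell)
        return cell

    def gc(self):
        return self._collect(self)


def full_collect(heap):
    """Non-incremental policy: sweep the whole heap in a single pause."""
    before = len(heap.objects)
    heap.objects = [c for c in heap.objects if c["live"]]
    return before - len(heap.objects)


def make_incremental_collect(step):
    """Incremental policy: examine at most `step` cells per call."""

    def collect(heap):
        # Wrap around once the previous sweep has reached the end.
        start = heap.cursor if heap.cursor < len(heap.objects) else 0
        window = heap.objects[start:start + step]
        survivors = [c for c in window if c["live"]]
        heap.objects[start:start + step] = survivors
        heap.cursor = start + len(survivors)
        return len(window) - len(survivors)

    return collect
```

A real-time task would then allocate from something like
Heap("realtime", bytearray, make_incremental_collect(2)), where each
gc() call does a bounded amount of work, while batch code uses
Heap("batch", bytearray, full_collect) and pays one longer pause.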

Sandro

On 6/2/07, Sandro Magi <naasking at gmail.com> wrote:
> Many VMs focus on performance, optimizations, memory consumption, etc.
> but very few, if any, focus on fault isolation and security. Given
> memory safety, any VM reduces to capability security, which is
> sufficient to implement most security policies of interest; however,
> most such VMs still ignore two main attack vectors from malicious
> code: DoS attack on memory allocation, and DoS against the CPU.
>
> I've been mulling over how LLVM could be extended to provide a degree
> of isolation from these two attack vectors [3].
>
> Preventing a DoS against memory allocation involves controlling access
> to allocation in some way. Fine-grained control over every single
> allocation is likely infeasible [1]. Similarly, preventing a DoS
> against the CPU involves controlling the execution time of certain
> code blocks, by introducing concurrency or flow control of some sort.
>
> There is a single abstraction which has solved the above two problems
> for over 40 years: the process, which provides an isolated memory
> space, and an independently schedulable execution context.
>
> A VM process would run in its own heap and manage its own memory. The
> memory allocation routines are scoped to the process, which can itself
> potentially call out to a "space bank" to allocate more space for its
> heap. Memory faults in a process can be handled by "keepers" [4].
>
> Concurrency is still an open question, because a kernel thread per VM
> process is actually overkill. A mix of kernel threads and Erlang-style
> preemptive green threads might be optimal, but this isn't the
> interesting part of the proposal IMO.
>
> There must also be some sort of interprocess communication (IPC),
> either via copying between heaps, or an "exchange heap". The exchange
> heap is the approach taken by the Singularity OS [2] where they add
> "software isolated processes" to the .NET VM and make it an operating
> system.
>
> There are two approaches I currently foresee for adding process
> constructs to LLVM:
>
> 1. Add process management instructions to the core instructions, and
> modify the runtime to rebind the allocation routines depending on some
> VM-level context that names which process is actually executing
> (perhaps in thread-local storage).
>
> 2. Add intrinsics to launch an entirely new VM instance
> (ExecutionEngine?) as if it were the process. This would involve
> modifying the VM to accept allocation primitives as function pointers,
> and potentially adding some scheduling awareness.
>
> At the moment, I'm not primarily interested in making LLVM itself a
> secure VM, but I think that too might be possible, and suggests
> possible future work.
>
> For instance, unsafe pointer operations can be made safe if the
> casting operation from integer to pointer implements a dynamic check
> that it's within the bounds of the heap. This is potentially an
> expensive operation, but such casts only penalize heavily unsafe
> programs, which should hopefully be rare. I believe LLVM programs that
> do not use these casting instructions are inherently memory safe, so
> they incur no such penalties (please correct me if I'm wrong). Using
> this approach, LLVM could support the safe execution of unsafe
> programs by running them in an isolated VM process.
>
> Alternately, one could actually launch the unsafe code in a completely
> separate OS process with a new LLVM instance, and the VM-level IPC
> instructions would transparently perform OS-level IPC to the separate
> process. This maintains the isolation properties, with the full
> execution speed (no need for dynamic heap bound checks), at the cost
> of using slightly heavier OS processes.
>
> Any comments on the feasibility of this approach? I'm definitely not
> familiar with the LLVM internals, and I wrote the above given only my
> understanding from reading the LLVM reference manual.
>
> Sandro
>
> [1] except perhaps using some sort of region-based approach with
> region inference, etc. I'm still reading the literature on this.
> [2] http://research.microsoft.com/os/singularity/
> [3] I realize that LLVM is unsafe in other ways, but I believe it
> currently lacks even the base constructs necessary to build a
> secure VM on top of it.
> [4] I can explain space banks and keepers concepts further, but just
> think of them as stateful exception handlers specific to a process.
> The concepts come from the KeyKOS/EROS and Coyotos secure operating
> systems.
>