[LLVMdev] secure virtual architecture / safecode

Wed Mar 25 20:35:43 PDT 2009

Hi Vikram,

I think it's worth continuing to discuss this on-list.  I'm interested in 
different kinds of embedded software, but specifically in TinyOS
applications.

> We (more accurately, John Criswell and Brice Lin) are working on a
> debugging version of SAFECode right now, which should be robust enough
> to play with soon.

This is great to hear.

I know the SAFECode paper and like it.  However, my intuition fails when I 
try to imagine what your transformations will do to a TinyOS
application.

These applications contain some substantially ugly pointer manipulation in 
the network stack.  The applications will clearly violate some
of the assumptions from your 2005 paper.  On the other hand, these apps 
are definitely no worse than the Linux kernel.  They are
substantially more static in nature, containing few or no function 
pointers and little or no heap allocation.

Our Safe TinyOS work (memory safety via Deputy) suffers from code size 
problems.  Some architectures, such as msp430, are quite flash-poor,
and so even a few percent code bloat will stop large applications from 
compiling.  This is too bad since these large applications are the
ones thst most need safety.  Since we have deployed Safe TinyOS as part of 
a recent release of TinyOS, the problem is not just a
theoretical one! Making things worse, msp430-gcc is not one of the 
stronger gcc ports.

So I am very interested in approaches that can reduce or eliminate dynamic 
checks.  However, this legacy code base is not going to lend
itself to heavy-handed restrictions.  I have been expecting that dragging 
a heavyweight decision procedure into the bounds check optimizer
will be necessary.

More broadly, I'm intersted in exploring whole-program compilation of 
embedded codes using LLVM.  The transformations that are most
interesting to me are those that do not make sense (and hence, have not 
been explored deeply) by the general-purpose compiler community.  A
good example is getting rid of the call stack (we have an upcoming LCTES 
paper about this).  Clearly it's stupid to do this for Firefox or
MSWord, but it can win for microcontroller codes.  Our prototype was 
partially via source-to-source transformation, partially via gcc
hacks.  Overall this was not very fun-- we should have done the work in 
LLVM (but nobody wrote AVR or msp430 backends for us to use :).

John Regehr