[LLVMdev] Type-based analysis for LLVM IR
John Criswell
criswell at illinois.edu
Wed Apr 3 08:31:07 PDT 2013
On 4/3/13 8:21 AM, Marcelo Sousa wrote:
> I am interested in type-based verification of LLVM IR in the areas of
> certified compilation and software
Can you explain what you mean by type-based verification? Do you mean
that you want to use a set of typing rules to perform verification, or
do you mean something else?
> verification. It seems to me that the LLVM IR type system is rather
> informal in the sense that there is no paper with a proper
> formalization of the type rules, and for example, a proof of soundness
> for well-formedness of the code.
Another important feature of the LLVM type system is that it is not a
"safe" type system. It is possible to do all sorts of type unsafe
things, such as casting integers to pointers and dereferencing them.
One of the standard LLVM transforms (load widening) will generate loads
that access memory past the end of a memory object. LLVM IR is capable
of representing all the unsafe computations that C can do.
> I would like to know if you are aware of any work in this direction
> for LLVM IR so that I can compare with my own implementation.
The SAFECode/SVA work uses a points-to analysis called DSA which infers
the types of memory objects. A key result of the work is that the
points-to analysis/type inference results of DSA can be made sound at
run-time. The original version of DSA, working on older versions of
LLVM, was able to find a fairly decent number of type-safe memory
objects. The SAFECode/SVA work developed a simple type checker that
verified that the type-inference results returned by DSA are consistent
(i.e., if DSA says that a memory object is used in a type-safe manner,
then it is actually used in a type-safe manner). There is also some
work by Greg Morrisett's group at Harvard at creating a verified
type-checker for the SAFECode/SVA system. SAFECode/SVA papers can be
found at http://sva.cs.illinois.edu/pubs.html.
As far as formalisms for the LLVM IR, there is one from University of
Pennsylvania (Santosh Nagarakatte, now at Rutgers, was involved with
that work), the one used by Greg Morrisett's group for the verified type
checker, and there might be one from Grigore Rosu's group here at the
University of Illinois. There is also an LLVM translation validator
from Greg Morrisett's group (http://llvm-md.org).
I think another person wrote asking about translation validation a few
months ago. You might want to search the llvmdev archives for that
conversation.
> Moreover, it would be valuable to get some feedback on suitable
> type-based analysis that the LLVM community would like to have in the
> infrastructure.
LLVM already as a type-based alias analysis (called TBAA) which is used
for optimization. There's a proposal floating around on the list to
improve it; you might want to take a look at that.
SAFECode can use type-safety to optimize away load/store checks
(provided that it uses garbage collection or automatic pool allocation
to make the optimization sound). Using TBAA instead of DSA might
generate better results. Address Sanitizer (ASan) might benefit from
that optimization as well.
Automatic pool allocation could also benefit from something like TBAA.
I think you'll have to be more specific, though, on what you mean by
type-based analysis and provide details on how it could be applied.
I apologize if my response is a bit of a ramble; I'm basically providing
a bit of information hoping that some of it is relevant to what you are
asking. I hope you find some of it useful.
-- John T.
>
> Regards,
> Marcelo
>
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20130403/c8c88793/attachment.html>
More information about the llvm-dev
mailing list