[LLVMdev] summer of code idea — checking bounds overflow bugs
John Criswell
criswell at uiuc.edu
Tue Mar 30 21:07:09 PDT 2010
罗勇刚(Yonggang Luo) wrote:
> Sounds like a good idea. Does that mean lowering the SAFECode project
> from the higher level (Clang) to a lower level, for more general work on
> bounds checking?
SAFECode has always worked on the LLVM IR.
What I am saying is that my preference is to have LLVM passes that do
static array bounds checking instead of Clang passes that do static
array bounds checking. The problem that I see with implementing static
array bounds checking in Clang is that it benefits only languages
utilizing Clang's libraries. That means that VMKit, llvm-gcc/g++, and
other potential frontends can't benefit from it. SAFECode won't derive
any benefit except when it is used in conjunction with Clang. That's
okay but not ideal.
Also, SAFECode, being a set of LLVM passes, can consume the results of
other LLVM passes more easily than those of Clang passes. If static
array bounds checking were implemented in
Clang, then a Clang-based transform would need to insert information
into the LLVM IR to communicate to SAFECode which GEP instructions
stayed within bounds. If static array bounds checking is implemented as
an LLVM pass, then SAFECode will just need to add it as a prerequisite
and query the results.
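To make the "add it as a prerequisite and query the results" arrangement concrete, here is a minimal C++17 sketch. It is a toy model and does not use the LLVM API: IndexOp, StaticBoundsAnalysis, and opsNeedingRuntimeCheck are hypothetical names invented for illustration, and the only case it proves safe is a constant index into an array of known length.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <unordered_set>
#include <vector>

// Toy stand-in for an indexing (GEP-like) instruction.
// Hypothetical type for illustration; this is not an LLVM class.
struct IndexOp {
    int id;                             // unique identifier of the instruction
    std::uint64_t arrayLength;          // statically known number of elements
    std::optional<std::int64_t> index;  // constant index, or nullopt if unknown
};

// Toy "analysis pass": records which indexing operations are provably
// in bounds. It changes nothing; it only computes results to be queried.
class StaticBoundsAnalysis {
public:
    void run(const std::vector<IndexOp>& ops) {
        for (const IndexOp& op : ops) {
            if (op.index && *op.index >= 0 &&
                static_cast<std::uint64_t>(*op.index) < op.arrayLength) {
                provenSafe_.insert(op.id);
            }
        }
    }

    // Query interface used by client passes.
    bool isProvenInBounds(int id) const { return provenSafe_.count(id) != 0; }

private:
    std::unordered_set<int> provenSafe_;
};

// Toy "client pass" (the role SAFECode would play): returns the ids of the
// operations that still need a run-time check because the analysis could
// not prove them safe.
std::vector<int> opsNeedingRuntimeCheck(const std::vector<IndexOp>& ops,
                                        const StaticBoundsAnalysis& analysis) {
    std::vector<int> result;
    for (const IndexOp& op : ops) {
        if (!analysis.isProvenInBounds(op.id)) {
            result.push_back(op.id);
        }
    }
    return result;
}
```

The point of the split is that the client never needs annotations written into the IR: it runs after the analysis and asks it directly which accesses were proven safe.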
Now, having said that, static array bounds checking in Clang is probably
a very good thing for the Clang static analyzer, and having strong
static analysis tools for finding bugs is a good thing, so if anyone
wants to build static array bounds checking for Clang, go for it.
However, I can't mentor such a project (I have no experience with Clang
analyses), and it won't benefit my project (SAFECode) very easily.
> I also want to know: is it possible to detect memory
> leaks at the very low (LLVM IR) level?
I don't see why not. I believe Valgrind does it on machine code; you
could probably build an LLVM transform that does what Valgrind does but
does it more efficiently (primarily because using LLVM as a static
compiler removes the dynamic binary translation overhead).
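As a rough illustration of the instrumentation idea, here is a minimal C++ sketch of the run-time side. It assumes a hypothetical transform that inserts a call to onAlloc after every allocation and to onFree before every deallocation; LeakTracker and its methods are invented names, not part of LLVM or Valgrind. It only does allocation bookkeeping, which is much simpler than the reachability scan that Valgrind's Memcheck performs.

```cpp
#include <cassert>
#include <cstddef>
#include <map>

// Hypothetical run-time library that an instrumenting transform would
// call into. Blocks still recorded when the program exits are leak
// candidates.
class LeakTracker {
public:
    // Called after each allocation with the returned pointer and its size.
    void onAlloc(const void* p, std::size_t size) {
        if (p != nullptr) {
            live_[p] = size;
        }
    }

    // Called before each deallocation. Returns false if the pointer is not
    // a live block (for example, a double free).
    bool onFree(const void* p) { return live_.erase(p) == 1; }

    // Number of blocks allocated and not yet freed.
    std::size_t liveBlocks() const { return live_.size(); }

    // Total size in bytes of blocks allocated and not yet freed.
    std::size_t liveBytes() const {
        std::size_t total = 0;
        for (const auto& entry : live_) {
            total += entry.second;
        }
        return total;
    }

private:
    std::map<const void*, std::size_t> live_;
};
```

Because the calls would be compiled into the program, there is no translation step at run time; the cost is one map update per allocation and per free.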
> Or, at
> the LLVM IR level, to provide stackful hooks? It would be very useful to
> have such a feature. The stack hooks could help us print extra stack
> info at execution time without modifying the original code, to help us
> find bugs more easily :)
>
I'm not sure what you mean here. Can you clarify?
-- John T.
P.S. I use the term "Clang IR" to mean whatever data structures Clang
uses to represent code. I believe it uses Abstract Syntax Trees
(ASTs). Perhaps I should have said ASTs...