[LLVMdev] [GSoC 2014] Using LLVM as a code-generation backend for Valgrind

Tue Feb 25 10:06:34 PST 2014

On 02/25/2014 04:50 PM, John Criswell wrote:
>
> I think a more interesting idea would be to use LLVM to perform
> instrumentation and then to use Valgrind to instrument third-party
> libraries linked into the program.
>
> What I'm imagining is this: Let's say you instrument a program with
> SAFECode or Asan to find memory safety errors.  When you run the program
> under Valgrind, the portion of the code instrumented by SAFECode or Asan
> runs natively without dynamic binary instrumentation because it's
> already been instrumented.  When the program calls uninstrumented code
> (e.g., code in a dynamic library), Valgrind starts dynamic binary
> instrumentation to do instrumentation.
>
> A really neat thing you could do with this is to share run-time data
> structures between the LLVM and Valgrind instrumentation.  For example,
> Valgrind could use SAFECode's meta-data on object allocations and
> vice-versa.
>

Someone proposed caching the results of JIT compilation. Caching LLVM
bitcode is easy (the LLVM optimizations operate on the IR that bitcode
serializes, so bitcode saved after optimization does not need to be
re-optimized when it is reloaded), and may be a good way to speed up
Valgrind. Caching native binary code is more difficult and would only
be useful if LLVM's codegen is slow (I think that the codegen can be
configured to be fast, for instance by using a simpler register
allocator).

If every .so is cached in a separate bitcode file, loading an
application would only require generating bitcode for the application
itself, not for the libraries it uses, provided that they have not
changed since they were last analyzed for another application. That
may speed up Valgrind's start-up.
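
To make the per-library cache idea a bit more concrete, here is a rough
sketch of the lookup logic only. It is not Valgrind or LLVM code: the
names library_digest, cache_entry and load_or_translate are made up for
illustration, and the translate callback stands in for whatever would
actually turn a library into bitcode. The one design assumption is that
the cache key is a hash of the library's contents, so a changed .so
never matches a stale entry.

```python
import hashlib
from pathlib import Path
from typing import Callable


def library_digest(lib_path: Path) -> str:
    """Hash the library's contents so that any change yields a new key."""
    h = hashlib.sha256()
    with open(lib_path, "rb") as f:
        # Read in 1 MiB chunks to avoid loading large libraries at once.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def cache_entry(cache_dir: Path, lib_path: Path) -> Path:
    """Hypothetical layout: one bitcode file per (library name, digest)."""
    return cache_dir / f"{lib_path.name}-{library_digest(lib_path)}.bc"


def load_or_translate(
    cache_dir: Path,
    lib_path: Path,
    translate: Callable[[Path], bytes],
) -> bytes:
    """Return cached bitcode for lib_path, translating only on a miss.

    `translate` is a placeholder for the real binary-to-bitcode step.
    """
    entry = cache_entry(cache_dir, lib_path)
    if entry.exists():
        return entry.read_bytes()

    bitcode = translate(lib_path)
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Write to a temporary file and rename it, so that a concurrent
    # reader never sees a half-written cache entry.
    tmp = entry.with_suffix(".tmp")
    tmp.write_bytes(bitcode)
    tmp.replace(entry)
    return bitcode
```

With something like this, a second application that uses the same
unchanged libfoo.so would hit the cache and skip translation, while a
rebuilt libfoo.so would hash differently and be translated again.
Hashing every library at start-up has its own cost; comparing size and
modification time would be cheaper but less reliable.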