[LLVMdev] Instrumenting C/C++ programs
criswell at illinois.edu
Fri Sep 23 10:43:13 PDT 2011
On 9/23/11 12:24 PM, Himanshu Shekhar wrote:
> I just read that the LLVM project could be used to do static analysis on
> C/C++ code using Clang, which is the front end of LLVM. I
> wanted to know if it is possible to extract all the accesses to
> memory (variables, local as well as global) in the source code using LLVM.
When doing analysis with Clang and LLVM, you first must make a choice
about which IR to use: Clang's Abstract Syntax Tree (AST) or LLVM's SSA
Intermediate Representation (IR). Clang takes source code and converts
it into an AST; it later takes the AST and converts it to LLVM IR. LLVM
then performs mid-level compiler analysis and optimization on code in
LLVM IR form and then translates from LLVM IR to native code.
Clang ASTs will give you much higher level information than LLVM IR. On
the other hand, LLVM IR is probably easier to work with and is
programming language agnostic.
You might want to read the LLVM Language Reference Manual
(http://llvm.org/docs/LangRef.html) to get a feel for whether LLVM IR is
suitable for your analysis. There may be a similar document for Clang,
but I'm not familiar with it since I haven't worked with Clang ASTs myself.
> Is there any built-in library in LLVM which I could use to
> extract this information? If not, please suggest how to write
> functions to do the same (existing source code, reference, tutorial,
It is easy to write an LLVM pass that plugs into the opt tool and
searches for explicit accesses to memory. The LLVM load and store
instructions access memory (similar to how loads and stores are used to
access memory in a RISC instruction set). That said, it is not clear
whether this is what you want to do. Some source-level variables are
translated into one or more SSA virtual registers, so you'll never see a
load or store to them (as they may never exist in memory but only in
registers). Additionally, some loads and stores to memory are not
visible at the LLVM IR level. For example, loads and stores to stack
spill slots are not visible at the LLVM IR level because they're only
created during code generation (and technically, they're generated in a
third IR called Machine Instructions that is used specifically for code
generation).
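
To make the idea concrete, here is a minimal sketch of what such a
load/store search does. It deliberately uses a toy, self-contained C++
model rather than the real LLVM API: the Opcode enum, the Inst struct,
countMemoryAccesses, and exampleBody are all hypothetical names invented
for illustration. A real pass would walk the instructions of an LLVM
function and test for LoadInst and StoreInst instead.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for an IR instruction. This is NOT the LLVM API; it only
// models the distinction between memory and non-memory instructions.
enum class Opcode { Alloca, Load, Store, Add, Ret };

struct Inst {
    Opcode op;
    // Address operand for Load/Store, name of the slot for Alloca,
    // empty for instructions that only touch virtual registers.
    std::string pointer;
};

struct MemoryAccessCounts {
    std::size_t loads = 0;
    std::size_t stores = 0;
};

// Walk a function body and count the explicit memory accesses.
MemoryAccessCounts countMemoryAccesses(const std::vector<Inst>& body) {
    MemoryAccessCounts counts;
    for (const Inst& inst : body) {
        if (inst.op != Opcode::Load && inst.op != Opcode::Store) {
            continue;  // register-only instruction: nothing to record
        }
        assert(!inst.pointer.empty() && "memory access needs an address");
        if (inst.op == Opcode::Load) {
            ++counts.loads;
        } else {
            ++counts.stores;
        }
    }
    return counts;
}

// Hypothetical function body: one stack slot, one global named "g".
std::vector<Inst> exampleBody() {
    return {
        Inst{Opcode::Alloca, "x.addr"},
        Inst{Opcode::Store, "x.addr"},
        Inst{Opcode::Load, "x.addr"},
        Inst{Opcode::Add, ""},
        Inst{Opcode::Load, "g"},
        Inst{Opcode::Ret, ""},
    };
}
```

Calling countMemoryAccesses(exampleBody()) on this made-up body reports
two loads and one store. The add contributes nothing because its
operands live in virtual registers, which is exactly the case where a
source-level variable never shows up as a memory access.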
> From what I have studied, I need to first convert the source code into
> LLVM IR and then make an instrumenting pass which would go over this
> bitcode file and insert calls to do the analysis, but I don't know
> exactly how to do it.
The first thing you need to do is figure out which representation of the
program (Clang ASTs, LLVM IR, LLVM's code generation IR) is the best for
solving your particular problem. If you want, you can provide more
details on what you're trying to do; people on the list can then provide
feedback on which representation is most suitable for what you want to do.
If you decide to work with LLVM IR, I then recommend reading the "How to
Write an LLVM Pass" document
(http://llvm.org/docs/WritingAnLLVMPass.html) as well as the
Programmer's Manual (http://llvm.org/docs/ProgrammersManual.html).
The Doxygen documentation is also valuable (http://llvm.org/doxygen/).
For an example of a pass that adds run-time checks to LLVM IR loads and
stores, look at SAFECode's load/store instrumentation pass.
It's about as simple as an instrumentation pass gets.
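
As a rough sketch of the shape such an instrumentation pass takes, the
toy C++ below inserts a call to a run-time check in front of every load
and store. It is not SAFECode's code and does not use the LLVM API: the
Inst struct, instrumentMemoryAccesses, and the check function name
passed in by the caller are all hypothetical, chosen only to show the
transformation.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for an IR instruction. This is NOT the LLVM API.
enum class Opcode { Load, Store, Add, Call, Ret };

struct Inst {
    Opcode op;
    std::string target;  // address for Load/Store, callee name for Call
    std::string arg;     // for Call: the address being checked
};

// Return a copy of `body` in which every load and store is preceded by
// a call to `checkFn`, passing the address about to be accessed.
std::vector<Inst> instrumentMemoryAccesses(const std::vector<Inst>& body,
                                           const std::string& checkFn) {
    assert(!checkFn.empty() && "need the name of a check function");
    std::vector<Inst> out;
    out.reserve(body.size() * 2);
    for (const Inst& inst : body) {
        if (inst.op == Opcode::Load || inst.op == Opcode::Store) {
            // The check goes first so it runs before the access does.
            out.push_back(Inst{Opcode::Call, checkFn, inst.target});
        }
        out.push_back(inst);
    }
    return out;
}
```

Given a body of load p, add, store q, ret and a made-up check function
name such as "check_access", the result has six instructions: a check
call before the load and another before the store, with the add and ret
left untouched.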
-- John T.
> Please suggest how to go about it.