[LLVMdev] Instrumenting C/C++ programs

xiaoming gu xiaoming.gu at gmail.com
Tue Sep 27 18:19:44 PDT 2011


Hi Himanshu. I once wrote a memory profiling pass that works on LLVM IR,
modeled on the code for EdgeProfiling. The source code is attached; it
worked with LLVM 2.8. I hope it helps.

    MemoryProfiling.cpp---the instrumentation pass, which inserts profiling
        function calls into the original program
    MemoryProfiling.c---the profiling library containing the profiling calls
    llvm-memory-profiling.patch---the other modifications
    notes.txt---some information collected when I was working on this
        profiling pass
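
To give a feel for how the two source files divide the work, here is a minimal
sketch of what the library side of such a pass could look like. It is not the
attached MemoryProfiling.c: the hook names mp_record_load and mp_record_store,
the accessor and report functions, and the choice to keep simple global
counters are all assumptions made for illustration. The idea is only that the
instrumentation pass inserts calls to hooks like these next to each load and
store, and the library does the bookkeeping at run time.

```cpp
#include <cstdint>
#include <cstdio>

// Hypothetical profiling runtime; NOT the attached MemoryProfiling.c.
// The counters are plain globals, so this sketch is not thread-safe.
namespace {
std::uint64_t g_loads = 0;
std::uint64_t g_stores = 0;
}  // namespace

// C linkage keeps the symbol names unmangled, so calls inserted into the IR
// can refer to the hooks by these exact names.
extern "C" void mp_record_load(const void *addr) {
  (void)addr;  // a fuller profiler might bucket accesses by address
  ++g_loads;
}

extern "C" void mp_record_store(const void *addr) {
  (void)addr;
  ++g_stores;
}

extern "C" std::uint64_t mp_load_count() { return g_loads; }
extern "C" std::uint64_t mp_store_count() { return g_stores; }

// Prints the totals to stderr.
extern "C" void mp_report() {
  std::fprintf(stderr, "loads=%llu stores=%llu\n",
               static_cast<unsigned long long>(g_loads),
               static_cast<unsigned long long>(g_stores));
}
```

An instrumented program would be linked against a library like this and could
call mp_report() when it exits, for example by registering it with atexit.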

Xiaoming

On Tue, Sep 27, 2011 at 7:13 PM, Himanshu Shekhar <imhimanshu91 at gmail.com> wrote:

> Hey John,
> Thank you for the detailed reply.
> I tried to figure out which IR I should use for my purpose (Clang's
> Abstract Syntax Tree (AST) or LLVM's SSA Intermediate Representation (IR)),
> but couldn't really decide between them.
> Here is what I am trying to do.
> Given any C/C++ program (like the one given below), I am trying to insert
> calls to some function before and after *every instruction that reads from
> or writes to memory*. For example, consider the C++ program below
> (Account.cpp):
> /***********************************************************/
>
> #include <stdio.h>
>
> class Account {
>   int balance;
>
> public:
>   Account(int b)
>  {
>     balance = b;
>   }
>   ~Account(){ }
>
>   int read() {
>     int r;
>     r = balance;
>     return r;
>   }
>
>   void deposit(int n) {
>       balance = balance + n;
>   }
>
>   void withdraw(int n) {
>       int r = read();
>       balance = r - n;
>   }
> };
>
> int main (){
>   Account* a = new Account(10);
>   a->deposit(1);
>   a->withdraw(2);
>   delete a;
> }
>
> /***********************************************************/
> So after the instrumentation, my program should look like this:
>
> /***********************************************************/
>
> #include <stdio.h>
>
> class Account {
>   int balance;
>
> public:
>   Account(int b)
>  {
>     balance = b;
>   }
>   ~Account(){ }
>
>   int read() {
>     int r;
>     foo();
>     r = balance;
>     foo();
>     return r;
>   }
>
>   void deposit(int n) {
>       foo();
>       balance = balance + n;
>       foo();
>   }
>
>   void withdraw(int n) {
>       foo();
>       int r = read();
>       foo();
>       foo();
>       balance = r - n;
>       foo();
>   }
> };
>
> int main (){
>   Account* a = new Account(10);
>   a->deposit(1);
>   a->withdraw(2);
>   delete a;
> }
>
> /***********************************************************/
> where *foo()* may be any function, such as one that gets the current system
> time or increments a counter. I understand that to insert calls like the
> ones above I first have to get the IR and then run an instrumentation pass
> that inserts such calls into it, but I don't really know how to do that.
> Please suggest, with examples, how to go about it.
> Also, I understand that once I compile the program into IR, it would be
> really difficult to get a 1:1 mapping between my original program and the
> instrumented IR. So, is it possible to reflect the changes made in the IR
> (because of instrumentation) back into the original program?
>
> To get started with LLVM passes and learn how to write one of my own, I
> looked at an example of a pass that adds run-time checks to LLVM IR loads
> and stores, SAFECode's load/store instrumentation pass (
> http://llvm.org/viewvc/llvm-project/safecode/trunk/include/safecode/LoadStoreChecks.h?view=markup and
> http://llvm.org/viewvc/llvm-project/safecode/trunk/lib/InsertPoolChecks/LoadStoreChecks.cpp?view=markup).
> But I couldn't figure out how to run this pass. Please give me the steps to
> run it on some program, say the Account.cpp above.
>
> Thanks,
> Himanshu
>
>
>
>
> On Fri, Sep 23, 2011 at 11:13 PM, John Criswell <criswell at illinois.edu> wrote:
>
>>  On 9/23/11 12:24 PM, Himanshu Shekhar wrote:
>>
>> I just read that the LLVM project can be used to do static analysis on
>> C/C++ code using Clang, which is the front end of LLVM. I wanted to know
>> whether it is possible to extract all the accesses to memory (variables,
>> local as well as global) in the source code using LLVM.
>>
>>
>> When doing analysis with Clang and LLVM, you first must make a choice
>> about which IR to use: Clang's Abstract Syntax Tree (AST) or LLVM's SSA
>> Intermediate Representation (IR).  Clang takes source code and converts it
>> into an AST; it later takes the AST and converts it to LLVM IR.  LLVM then
>> performs mid-level compiler analysis and optimization on code in LLVM IR
>> form and then translates from LLVM IR to native code.
>>
>> Clang ASTs will give you much higher level information than LLVM IR.  On
>> the other hand, LLVM IR is probably easier to work with and is programming
>> language agnostic.
>>
>> You might want to read the LLVM Language Reference Manual (
>> http://llvm.org/docs/LangRef.html) to get a feel for whether it is
>> suitable for your analysis.  There may be a similar document for Clang, but
>> I'm not familiar with it since I haven't worked with Clang ASTs myself.
>>
>>
>> Is there any built-in library in LLVM that I could use to extract this
>> information? If not, please suggest how to write functions to do the
>> same (existing source code, reference, tutorial, example...).
>>
>>
>> It is easy to write an LLVM pass that plugs into the opt tool that
>> searches for explicit accesses to memory.  The LLVM load and store
>> instructions access memory (similar to how loads and stores are used to
>> access memory in a RISC instruction set).  That said, it is not clear
>> whether this is what you want to do.  Some source-level variables are
>> translated into one or more SSA virtual registers, so you'll never see a
>> load or store to them (as they may never exist in memory but only in
>> registers).  Additionally, some loads and stores to memory are not visible
>> at the LLVM IR level.  For example, loads and stores to stack spill slots
>> are not visible at the LLVM IR level because they're only created during
>> code generation (and technically, they're generated in a third IR called
>> Machine Instructions that is used specifically for code generation).
>>
>>
>>
>> From what I have studied, I first need to convert the source code into LLVM
>> IR and then write an instrumenting pass that goes over this bitcode file
>> and inserts calls to do the analysis, but I don't know exactly how to do it.
>>
>>
>> The first thing you need to do is figure out which representation of the
>> program (Clang ASTs, LLVM IR, LLVM's code generation IR) is the best for
>> solving your particular problem.  If you want, you can provide more details
>> on what you're trying to do; people on the list can then provide feedback on
>> which representation is most suitable for what you want to do.
>>
>> If you decide to work with LLVM IR, I then recommend reading the "How to
>> Write an LLVM Pass" document (http://llvm.org/docs/WritingAnLLVMPass.html)
>> as well as the Programmer's Manual (
>> http://llvm.org/docs/ProgrammersManual.html).  Doxygen is also valuable (
>> http://llvm.org/doxygen/).
>>
>> For an example of a pass that adds run-time checks to LLVM IR loads and
>> stores, look at SAFECode's load/store instrumentation pass (
>> http://llvm.org/viewvc/llvm-project/safecode/trunk/include/safecode/LoadStoreChecks.h?view=markup and
>> http://llvm.org/viewvc/llvm-project/safecode/trunk/lib/InsertPoolChecks/LoadStoreChecks.cpp?view=markup).
>> It's about as simple as an instrumentation pass gets.
>>
>> -- John T.
>>
>>
>> Please suggest how to go about it.
>> Thanks,
>> Himanshu
>> --
>>
>>
>>
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>
>>
>>
>
-------------- next part --------------
*study LLVM and add a memory access profiling pass into it

**how to write a pass
http://llvm.org/docs/WritingAnLLVMPass.html

**add a memory profiling pass to llvm
1. copy lib/Transforms/Instrumentation/EdgeProfiling.cpp to
   lib/Transforms/Instrumentation/MemoryProfiling.cpp
2. edit MemoryProfiling.cpp to adapt to the new pass
3. add a line to include/llvm/LinkAllPasses.h 
4. add a line to include/llvm/Transforms/Instrumentation.h
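
Step 2 is where the actual work is. The attached MemoryProfiling.cpp is not
reproduced here; as a sketch of the idea only, the toy model below walks a
list of instructions and places a call to a hook before and after every load
and store. The Inst and Op types and the instrumentMemoryOps function are
hypothetical stand-ins, not LLVM API; in a real pass the loop would visit
llvm::Instruction objects and check for LoadInst and StoreInst.

```cpp
#include <string>
#include <vector>

// Toy stand-in for an IR instruction; this is NOT llvm::Instruction.
enum class Op { Load, Store, Add, Call, Ret };

struct Inst {
  Op op;
  std::string text;  // printable form, for illustration only
};

// Returns a copy of `body` in which every Load and Store is surrounded by
// a call to `hook`. All other instructions are copied through unchanged.
std::vector<Inst> instrumentMemoryOps(const std::vector<Inst> &body,
                                      const std::string &hook) {
  std::vector<Inst> out;
  out.reserve(body.size() * 3);  // worst case: every instruction gets 2 calls
  for (const Inst &inst : body) {
    const bool touchesMemory = inst.op == Op::Load || inst.op == Op::Store;
    if (touchesMemory) out.push_back({Op::Call, "call " + hook});
    out.push_back(inst);
    if (touchesMemory) out.push_back({Op::Call, "call " + hook});
  }
  return out;
}
```

Applied to the sequence load, add, store, ret with the hook name foo, this
yields call foo, load, call foo, add, call foo, store, call foo, ret.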

**the compilation process of llvm
llvmc is the driver calling the following steps
  1. llvm-gcc/llvm-g++/llvm-gfortran
     frontend
     C/C++/Fortran => .ll => .bc
  2. opt (use -opt option)
     language-independent and machine-independent transformations
     .bc => .bc
  3. llc
     code generator
     .bc => .s
  4. as
     assembler
     .s => .o
  5. ld
     linker
     .o => executable
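
The steps above form a chain in which each tool consumes what the previous
one produced, and the memory profiling pass lives in step 2. A small sketch
of that chain follows; the Stage struct and the function names are made up
for illustration, only the C frontend is listed, and the frontend's
intermediate .ll output is folded into a single entry.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical description of one step of the llvmc-driven pipeline.
struct Stage {
  std::string tool;
  std::string role;
  std::string input;   // file kind consumed
  std::string output;  // file kind produced
};

// The five steps as listed in the notes, for a C input file.
std::vector<Stage> llvmcStages() {
  return {
      {"llvm-gcc", "frontend", ".c", ".bc"},
      {"opt", "IR transformations", ".bc", ".bc"},
      {"llc", "code generator", ".bc", ".s"},
      {"as", "assembler", ".s", ".o"},
      {"ld", "linker", ".o", "executable"},
  };
}

// True when every stage consumes exactly what the previous stage produced.
bool isWellFormedChain(const std::vector<Stage> &stages) {
  for (std::size_t i = 1; i < stages.size(); ++i) {
    if (stages[i].input != stages[i - 1].output) return false;
  }
  return true;
}
```

Because opt both reads and writes .bc, extra passes such as an
instrumentation pass can be slotted into that step without disturbing the
rest of the chain.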

**call the memory profiling pass
llvmc -opt -Wo,=-insert-memory-profiling xxx.c

**The position of the edge profiling pass with "-insert-edge-profiling"
Case 1: "-O3 -insert-edge-profiling" (USE THIS WAY!!!)
     The separate function-level passes at the beginning are
     bypassed, and the module pass "Edge Profiler" is called almost at
     the end, just before the last "Function Pass Manager" pass.
Case 2: "-insert-edge-profiling -O3"
     The separate function-level passes at the beginning
     remain.  The module pass "Edge Profiler" is the first of
     the module-level passes.

**llvm edge profiling related stuff
lib/Transforms/Instrumentation/ => Debug/lib/libLLVMInstrumentation.a
runtime/libprofile/ => Debug/lib/profile_rt.dylib
add "BUILD_ARCHIVE = 1" to runtime/libprofile/Makefile
runtime/libprofile/ => Debug/lib/profile_rt.a

**llvm memory profiling
llvmc -v -O3 -opt -Wo,=-insert-memory-profiling -Wl,=/Users/xiaoming/Work/llvm/llvm-2.7/Debug/lib/profile_rt.a -o SOR SOR.c

**change to use gold plugin for LLVM
1. build binutils gold
   a) ./configure --prefix=/home/vax6/p28/compiler2/xiaoming/INSTALL/binutils --enable-gold=both/gold --enable-lto --enable-plugins --enable-build-with-cxx
   b) make;make install
   c) the installed ld is gold
2. build llvm-2.7
   a) ./configure --with-binutils-include=/home/vax6/p28/compiler2/xiaoming/binutils-2.20.51/include --prefix=/home/vax6/p28/compiler2/xiaoming/INSTALL/llvm-2.7 --enable-optimized
   b) make;make install
   c) the built gold plugin is /home/vax6/p28/compiler2/xiaoming/LLVM/llvm-2.7/build/Release/lib/libLLVMgold.so
3. make a symbolic link to the gold plugin in /home/vax6/p28/compiler2/xiaoming/LLVM/llvm-gcc-4.2-2.7-i686-linux/libexec/gcc/i686-pc-linux-gnu/4.2.1, which is a sub-directory of the gcc front end for llvm
-------------- next part --------------
A non-text attachment was scrubbed...
Name: llvm-memory-profiling.patch
Type: application/octet-stream
Size: 3513 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20110927/04f168cf/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: MemoryProfiling.c
Type: text/x-csrc
Size: 2207 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20110927/04f168cf/attachment.c>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: MemoryProfiling.cpp
Type: text/x-c++src
Size: 4634 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20110927/04f168cf/attachment.cpp>

