[LLVMdev] LLVM instrumentation overhead

Fri Dec 9 12:05:12 PST 2011

On 12/9/11 1:32 PM, Nipun Arora wrote:
> Hi John,
>
> Thanks for the detailed answer, this gives me a good starting point to 
> look into.
>
> I was also wondering if you could give an idea (as a percentage) of the 
> overhead one can expect with such instrumentation. I want something 
> really lightweight and simple which can possibly be applied to 
> production systems, so overhead is a concern.

I don't really know what the overhead would be (I'm terrible at guessing 
these things), but I imagine it would degrade performance sufficiently 
that at least some people would consider it too slow for production.

On a related note, we built a dynamic tracing tool called Giri which 
records, in a file, the execution of functions and basic blocks.  The 
code is available from the llvm.org SVN repository 
(https://llvm.org/svn/llvm-project/giri/trunk) but is not actively 
maintained at present.

There are a few ideas from the Giri work that you may find useful:

1) We used mmap() to map the log file into application memory instead 
of using the write() system call to write data to the log.  Using mmap() 
should improve performance because the OS doesn't have to copy the data 
between user-space and kernel-space; instead, when you unmap or msync 
the virtual page, the OS kernel just dumps the data to disk directly.

2) A significant issue with Giri was controlling how much RAM the 
instrumentation used.  We opted to build Giri so that it would mmap() 
part of the log file into memory, write to that memory, and then unmap 
that region of the log and map in the next.  Since writing to RAM is 
much faster than writing to disk, we found that if we let the OS sync 
the data to disk asynchronously whenever it wanted, the program would 
produce log data faster than the disk could absorb it, and we would 
exhaust memory and slow things down.  What we opted to do instead was 
to make the unmap synchronous, meaning that all data would be written 
to disk before proceeding to the next section of the log.  This made 
controlling the memory consumption easier.
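
To make the two points above concrete, here is a rough sketch of the 
windowed mmap() scheme.  It is not Giri's actual code: the WindowedLog 
class, its method names, and the choice to leave zero padding at the end 
of a window when a record does not fit are all my own assumptions for 
illustration.  It assumes a POSIX system and a window size that is a 
multiple of the page size.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical windowed log writer (illustrative only, not Giri's code).
// Only one window of the log file is mapped at any time.
class WindowedLog {
public:
  // window_bytes must be a multiple of the page size so that every file
  // offset handed to mmap() is page-aligned.
  WindowedLog(const char *path, std::size_t window_bytes)
      : fd_(open(path, O_RDWR | O_CREAT | O_TRUNC, 0644)),
        window_bytes_(window_bytes) {
    assert(window_bytes_ % static_cast<std::size_t>(sysconf(_SC_PAGESIZE)) == 0);
    if (fd_ >= 0) map_next_window();
  }
  ~WindowedLog() {
    release_window();
    if (fd_ >= 0) close(fd_);
  }
  WindowedLog(const WindowedLog &) = delete;
  WindowedLog &operator=(const WindowedLog &) = delete;

  bool ok() const { return base_ != nullptr; }

  // Copy len bytes into the current window.  If they do not fit, flush the
  // window synchronously and map the next one (the tail stays zero-filled).
  bool append(const void *data, std::size_t len) {
    if (!ok() || len > window_bytes_) return false;
    if (used_ + len > window_bytes_) {
      release_window();
      file_offset_ += static_cast<off_t>(window_bytes_);
      if (!map_next_window()) return false;
    }
    std::memcpy(base_ + used_, data, len);
    used_ += len;
    return true;
  }

private:
  bool map_next_window() {
    // Grow the file so the new window is backed by real file space.
    if (ftruncate(fd_, file_offset_ + static_cast<off_t>(window_bytes_)) != 0)
      return false;
    void *p = mmap(nullptr, window_bytes_, PROT_READ | PROT_WRITE, MAP_SHARED,
                   fd_, file_offset_);
    if (p == MAP_FAILED) return false;
    base_ = static_cast<char *>(p);
    used_ = 0;
    return true;
  }
  void release_window() {
    if (!base_) return;
    // MS_SYNC blocks until the dirty pages are written back, which is what
    // bounds the amount of RAM held by unwritten log data.
    msync(base_, window_bytes_, MS_SYNC);
    munmap(base_, window_bytes_);
    base_ = nullptr;
  }

  int fd_;
  std::size_t window_bytes_;
  char *base_ = nullptr;
  std::size_t used_ = 0;
  off_t file_offset_ = 0;
};
```

The window size is the knob that trades memory use against how often the 
program stalls on the synchronous flush; I have no numbers to suggest a 
good value.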

Some other ideas:

1) Don't log function names; assign each function a numeric ID and log 
that ID.  That will reduce the amount of data you need to log during 
execution to a 32-bit number and the RDTSC value.

2) Consider using a helper thread to write data to disk.

3) You might be able to play some games using the call graph.  For 
example, if you know that function A, when called, will always call 
function B which will always call function C, then you only need to 
instrument function A instead of A, B, and C.

4) There was work at PLDI 2010 (IIRC) on creating hashes of the call 
stack (i.e., a single hash value could tell you the current function, 
its caller, the caller's caller, etc).  Utilizing this technique may 
reduce the number of instrumentation points in the program.
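
As a rough illustration of ideas 1 and 2 in this second list, the sketch 
below logs a fixed-size record (a 32-bit function ID plus a timestamp) 
and hands the records to a helper thread that writes them to disk.  The 
names TraceRecord, TraceWriter, and read_timestamp are made up for this 
example.  It assumes GCC or Clang on x86 for the __rdtsc() intrinsic and 
falls back to std::chrono::steady_clock elsewhere.  Taking a mutex on 
every call is only for brevity; a version meant for production would 
more likely use per-thread buffers.

```cpp
#include <chrono>
#include <condition_variable>
#include <cstdint>
#include <cstdio>
#include <mutex>
#include <thread>
#include <vector>
#if defined(__x86_64__) || defined(__i386__)
#include <x86intrin.h>
#endif

// One log entry: a numeric function ID instead of a name, plus a timestamp.
struct TraceRecord {
  std::uint32_t function_id;
  std::uint32_t reserved;   // explicit padding so the layout is unambiguous
  std::uint64_t timestamp;
};
static_assert(sizeof(TraceRecord) == 16, "unexpected record layout");

// Assumption: use RDTSC on x86 and a monotonic clock on other targets.
inline std::uint64_t read_timestamp() {
#if defined(__x86_64__) || defined(__i386__)
  return __rdtsc();
#else
  return static_cast<std::uint64_t>(
      std::chrono::steady_clock::now().time_since_epoch().count());
#endif
}

// Hypothetical writer: log() only queues a record; the helper thread
// does the file I/O in batches.
class TraceWriter {
public:
  explicit TraceWriter(const char *path)
      : out_(std::fopen(path, "wb")), worker_([this] { drain(); }) {}

  ~TraceWriter() {
    {
      std::lock_guard<std::mutex> lock(mu_);
      done_ = true;
    }
    cv_.notify_one();
    worker_.join();  // the helper flushes whatever is still queued
    if (out_) std::fclose(out_);
  }

  // This is what the call inserted at each function entry would invoke.
  void log(std::uint32_t function_id) {
    TraceRecord r{function_id, 0, read_timestamp()};
    {
      std::lock_guard<std::mutex> lock(mu_);
      pending_.push_back(r);
    }
    cv_.notify_one();
  }

private:
  void drain() {
    std::vector<TraceRecord> batch;
    std::unique_lock<std::mutex> lock(mu_);
    for (;;) {
      cv_.wait(lock, [this] { return done_ || !pending_.empty(); });
      batch.swap(pending_);  // take the whole queue in one step
      lock.unlock();         // write without holding the lock
      if (out_ && !batch.empty())
        std::fwrite(batch.data(), sizeof(TraceRecord), batch.size(), out_);
      batch.clear();
      lock.lock();
      if (done_ && pending_.empty()) return;
    }
  }

  std::FILE *out_;
  std::mutex mu_;
  std::condition_variable cv_;
  std::vector<TraceRecord> pending_;
  bool done_ = false;
  std::thread worker_;  // declared last so it starts after the other members
};
```

The mapping from numeric ID back to function name would have to be 
emitted separately, for example by the compiler pass at build time; that 
part is not shown here.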

-- John T.

>
> Thanks
> Nipun
>
> On 12/09/2011 02:21 PM, John Criswell wrote:
>> On 12/7/11 4:51 PM, Nipun Arora wrote:
>>> Hi,
>>>
>>> I need to write a transform pass which instruments the target program
>>> to output the name of each function executed, along with the rdtsc
>>> counter.
>>
>> Doing this in LLVM is really straightforward.  You simply iterate 
>> through all the functions in a module and add instructions to their 
>> entry basic blocks to do whatever it is that you want to do.
>>
>> I believe you already know how to find all the functions and their 
>> entry blocks.  Review the Programmer's Guide and the doxygen docs on 
>> llvm::Module and llvm::Function if there's something you don't 
>> understand.
>>
>> The only other question is how to insert instructions.  For that, you 
>> can take one of two approaches.  First, you can use the IRBuilder 
>> class (http://llvm.org/doxygen/classllvm_1_1IRBuilder.html).  Second, 
>> you can simply use the appropriate constructor/new methods of the 
>> Instruction classes to create and insert the instructions that you 
>> want.  I believe IRBuilder is now the preferred way to do things as 
>> its API changes less often.
>>
>> For the instrumentation that you want to do, the easiest thing to do 
>> would be to insert a call in every function to a function that you 
>> implement in a run-time library that does whatever the 
>> instrumentation should do.  This makes the compiler transform very 
>> simple.
>>>
>>> Can anyone give me an idea of how to go about it? (I've worked with
>>> the LLVM pass framework and opt to do static analysis, but would like
>>> to do lightweight instrumentation.) Also, can anyone give an
>>> approximate idea of the overhead for such instrumentation?
>>
>> To make things faster, you could compile your run-time library as a 
>> static library linked in using clang's/libLTO's link-time 
>> optimization.  Your run-time library can then be inter-procedurally 
>> inlined with the program that you are instrumenting.
>>
>> -- John T.
>>
>>>
>>> Thanks
>>> Nipun
>>>
>>>
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>
>