[LLVMdev] Dynamic Profiling - Instrumentation basic query
alastairmurray42 at gmail.com
Sun Jan 13 14:58:40 PST 2013
Firstly: Do you really need to do this? You can't just get a memory
access trace from a simulator? Or dcache access/miss rates from
Assuming this really is required:
On 12/01/13 23:28, Silky Arora wrote:
> I am new to LLVM, and would like to write a dynamic profiler, say which prints out the load address of all the load instructions encountered in a program.
> From what I could pick up, edge-profiler.cpp increments a counter dynamically which is somehow dumped onto llvmprof.out by profile.pl
profile.pl is just a wrapper. 'EdgeProfiling.cpp' inserts calls to
instrumentation functions on control flow edges. libprofile_rt.so
provides these functions (code is in runtime/libprofile). It is the
functions in libprofile_rt.so that dump the counters to llvmprof.out.
Note: libprofile_rt.so may be named differently on some platforms.
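To make the mechanism concrete, here is a rough sketch in C of the general idea: the instrumented program bumps counters as it runs, and a runtime routine registered at start-up writes them out at exit. This is not the libprofile source. Every name here ('profile_start', 'profile_edge', 'edge_counts.out', NUM_EDGES) is made up for illustration, and the real runtime's output format is different.

```c
#include <stdio.h>
#include <stdlib.h>

/* All names below are hypothetical; this is not the libprofile code. */
enum { NUM_EDGES = 4 };                 /* assumed number of instrumented edges */
static unsigned edge_counters[NUM_EDGES];

/* Runs at process exit and writes the raw counters to a file. */
static void dump_counters(void) {
    FILE *out = fopen("edge_counts.out", "wb");
    if (!out)
        return;
    fwrite(edge_counters, sizeof edge_counters[0], NUM_EDGES, out);
    fclose(out);
}

/* Called once at the top of main by the instrumented program. */
void profile_start(void) {
    atexit(dump_counters);
}

/* The work done on each instrumented control flow edge. */
void profile_edge(unsigned edge_id) {
    if (edge_id < NUM_EDGES)
        edge_counters[edge_id]++;
}
```

The point is only the split of responsibilities: the pass decides where the counting happens, and the runtime library owns the file I/O.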
> Could anyone explain me how this works? Can I instrument the code to dump out the load addresses or other such information to a file?
Yes, you can do this, though the current profiling code does not profile
load addresses at all. The quickest way to get this working is probably to:
A) Add a new pass to insert the load profiling instrumentation.
EdgeProfiling.cpp should provide a good starting point to copy from. You
just need to modify the IR to insert a call to some function, say
'llvm_profile_load_address', and in the IR pass this function the load
address as an argument.
B) Add an implementation of 'llvm_profile_load_address' to
runtime/libprofile (don't forget to add the symbol to
libprofile.exports). Perhaps it just does 'fprintf(file, "%p\n", addr);'.
C) If your implementation of 'llvm_profile_load_address' requires
initialisation (such as an fopen), add a call to an
'llvm_profile_load_start' function in the IR for 'main'; a matching
'llvm_profile_load_end' can then be registered with atexit().
EdgeProfiling.cpp does this, so you can look at that code to see how it
works.
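A minimal sketch of what the runtime side of steps B and C might look like, in C. The three function names are the ones suggested above; the output file name 'loadprof.out' and the helper 'instrumented_load' are assumptions for illustration, and none of this exists in libprofile today.

```c
#include <stdio.h>
#include <stdlib.h>

static FILE *load_log;   /* opened by llvm_profile_load_start */

/* Step C: registered with atexit() so the log is flushed and closed. */
void llvm_profile_load_end(void) {
    if (load_log) {
        fclose(load_log);
        load_log = NULL;
    }
}

/* Step C: the pass would insert a call to this at the top of main. */
void llvm_profile_load_start(void) {
    if (load_log)
        return;                              /* already started */
    load_log = fopen("loadprof.out", "w");   /* file name is an assumption */
    if (load_log)
        atexit(llvm_profile_load_end);
}

/* Step B: the pass would insert a call to this before each load. */
void llvm_profile_load_address(void *addr) {
    if (load_log)
        fprintf(load_log, "%p\n", addr);
}

/* Hypothetical helper: roughly how an instrumented 'x = *p;' behaves. */
int instrumented_load(int *p) {
    llvm_profile_load_address(p);
    return *p;
}
```

In the real thing the call in 'instrumented_load' would be emitted by the pass from step A rather than written by hand, with the load's pointer operand cast to a generic pointer type before being passed as the argument.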
This will slow down your code a lot. Maybe even by 100x. Faster
implementations are of course possible, but loads are very common, so
any extra work will slow things down a lot.
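As one example of a cheaper variant, the runtime could collect addresses in an in-memory buffer and write them out as raw binary in bulk, which avoids formatting a string on every load. The sketch below assumes a single-threaded program; the '_buffered' function names, the buffer capacity and the file name 'loadprof.bin' are all made up for illustration. I have not measured how much this helps.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

enum { LOAD_BUF_CAP = 4096 };            /* arbitrary capacity */
static uintptr_t load_buf[LOAD_BUF_CAP];
static size_t load_buf_len;
static FILE *load_log;

/* Writes the buffered addresses as raw binary and empties the buffer. */
static void flush_load_buf(void) {
    if (load_log && load_buf_len > 0)
        fwrite(load_buf, sizeof load_buf[0], load_buf_len, load_log);
    load_buf_len = 0;
}

/* Registered with atexit(): flush what is left, then close the file. */
void llvm_profile_load_end_buffered(void) {
    flush_load_buf();
    if (load_log) {
        fclose(load_log);
        load_log = NULL;
    }
}

/* Would be called at the top of main. */
void llvm_profile_load_start_buffered(void) {
    if (load_log)
        return;                              /* already started */
    load_log = fopen("loadprof.bin", "wb");  /* file name is an assumption */
    if (load_log)
        atexit(llvm_profile_load_end_buffered);
}

/* Hot path: a store and a compare per load in the common case. */
void llvm_profile_load_address_buffered(void *addr) {
    load_buf[load_buf_len++] = (uintptr_t)addr;
    if (load_buf_len == LOAD_BUF_CAP)
        flush_load_buf();
}
```

The output is then a flat array of pointer-sized integers, so it needs a small reader to turn it back into text. A multithreaded program would need a per-thread buffer or locking.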
Also, once all these calls to external functions have been added the
optimiser will likely be severely hindered. So to get realistic results
the load profiling instrumentation pass should probably happen as late
as possible.