[LLVMdev] Automating Diagnostic Instrumentation

Thu Apr 8 03:25:01 PDT 2004

Hi Reid,

Reid Spencer wrote:

>With the above kind of instrumentation in mind, I consider such 
>auto-instrumentation as "just another pass" in LLVM. That is, I believe 
>LLVM makes it pretty easy to do the call graph analysis, find the 
>"fan-out" points, and insert the instrumentation code.  So, given that 
>it's relatively easy to do this, I have the following questions for the
>list:
>  
>
First, thanks for your brilliant summary of instrumentation for 
performance analysis. I am currently working on program structure 
analysis for regression test selection and minimization, and found many 
similar points to what you describe.

>(1) Would others find this kind of diagnostic instrumentation useful?
>  
>
I would. Whenever a program grows large, it becomes crucial to measure 
selected indicators, at least to identify performance bottlenecks or to 
do coverage analysis (which parts of the code were used during a program 
execution). Instrumentation is a cornerstone of properly testing a 
program: without it you can't get your test coverage, and so you lack 
information that is fundamental to selecting which tests are relevant to 
(re)validate your program.
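
To make the test-selection point concrete, here is a minimal Python 
sketch. It assumes instrumentation has already recorded which functions 
each test executed; the test names, function names and the select_tests 
helper are all invented for illustration and are not part of LLVM.

```python
def select_tests(coverage, changed):
    """Return the tests whose recorded coverage touches a changed function."""
    changed = set(changed)
    return sorted(
        test for test, funcs in coverage.items() if changed & set(funcs)
    )


# Hypothetical coverage data: test name -> functions executed by that test.
coverage = {
    "test_parse": {"lex", "parse"},
    "test_emit": {"parse", "emit"},
    "test_cli": {"main"},
}

# Only tests that exercised the modified function need to be re-run.
print(select_tests(coverage, ["emit"]))
```

With real data the coverage map would come from the instrumented run 
rather than being written by hand.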

>(2) Would it be useful to turn the call graph data into a pretty picture 
>via graphviz (both statically and dynamically) ?
>  
>
It may, but it may be even more interesting to output the call graph to 
a database which can be processed later. For instance, the 
gcc-introspector project (http://introspector.sf.net) outputs to an RDF 
database, from which many things can be done. For some programs (if not 
many), the call graphs will be too complex to read as a picture. You can 
see a "static call graph" (resulting from static analysis) that I made 
of a GCC source file here <http://people.type-z.org/seb/bordel/sched.png>. 
And this is only a one-level call graph of one source file.
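
As a rough illustration of the database idea, the sketch below stores 
call-graph edges as subject/predicate/object rows in an in-memory SQLite 
table and queries them afterwards. This is only an assumption about what 
such a store could look like: it is not the introspector's RDF schema, 
and the function names and helpers are hypothetical.

```python
import sqlite3


def store_call_graph(edges):
    """Store (caller, callee) edges as subject/predicate/object rows."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE triples (subject TEXT, predicate TEXT, object TEXT)"
    )
    db.executemany("INSERT INTO triples VALUES (?, 'calls', ?)", edges)
    return db


def callers_of(db, function):
    """Return the sorted names of all functions that call `function`."""
    rows = db.execute(
        "SELECT DISTINCT subject FROM triples "
        "WHERE predicate = 'calls' AND object = ? ORDER BY subject",
        (function,),
    )
    return [row[0] for row in rows]


# Hypothetical edges, as a call-graph pass might report them.
edges = [("main", "parse"), ("main", "emit"), ("parse", "lex"), ("emit", "lex")]
db = store_call_graph(edges)
print(callers_of(db, "lex"))
```

Once the graph is in a queryable store, a picture becomes just one of 
several possible views, and it can be limited to the part of the graph 
a query selects.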

>(snip)
> This implies
>that I need the instrumentation pass to understand some things about the
>source-level language, and possibly capture information about the 
>environment the instrumentation application will run in.  Unfortunately,
>that means that the pass would either become language specific or 
>environment specific because I would have to ensure that the source 
>language compiler added the necessary constructs to the generated code 
>to provide the contextual information needed by the pass.  This raises a
>couple more questions:
>  
>
What you are talking about is very similar to what I would like to do: 
get structural information (list of functions, what they call, lists 
of classes, etc.). This means accessing and querying a program's 
structure through an API.  There was a post yesterday on LLVM/OpenC++, 
which is an attempt to offer C++ developers an API to reflect a C++ 
program's structure (they call it "meta-programming", but to me it seems 
mostly like reflection).

>(1) Is there a general mechanism to communicate information between 
>source language compiler and LLVM pass? If there isn't, should there 
>be?  In the case I describe above it would be *highly* useful (IMO) to 
>have the source language compiler provide source level information for a
>language independent pass to use later.
>  
>
I also think it would be highly useful to have this information, at 
least for the following use cases:

* Add instrumentation operations at selected program points
* Allow program structural-analysis passes (which can, for instance, 
detect bad design in program structure, or optimise recurrent 
structural patterns)
* Introspect a program to automatically generate bindings to other 
languages. Language X for LLVM could easily use libraries programmed in 
language Y for LLVM, because all program information is accessible and 
does not have to be generated using external utilities (like SWIG, for 
instance).
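
To give an idea of the bindings use case, here is a small Python sketch 
that turns a function description into ctypes declarations. The 
description format, the `scale` function and the `_lib` handle are all 
hypothetical; the assumption is simply that the compiler infrastructure 
could export something equivalent.

```python
# Assumed mapping from C type names to ctypes type names.
CTYPES = {
    "int": "ctypes.c_int",
    "double": "ctypes.c_double",
    "char*": "ctypes.c_char_p",
}


def render_binding(func, lib="_lib"):
    """Render ctypes argtypes/restype declarations for one function."""
    argtypes = ", ".join(CTYPES[ctype] for _name, ctype in func["params"])
    return (
        f"{lib}.{func['name']}.argtypes = [{argtypes}]\n"
        f"{lib}.{func['name']}.restype = {CTYPES[func['returns']]}\n"
    )


# Hypothetical structural information about one exported function.
scale = {
    "name": "scale",
    "params": [("value", "double"), ("factor", "int")],
    "returns": "double",
}

print(render_binding(scale), end="")
```

A tool like SWIG has to recover this kind of description by parsing 
headers; here it would be read directly from the program information.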

I must say that a compiler infrastructure offering even basic 
reflection support to all front-ends would really ease the weaving 
together of different programs. For instance, scripting languages 
like Python or Ruby (or anything dynamic) would have almost free 
bindings to any C or C++ library.

I hope this is possible, and I am ready to help!

 -- Sébastien