[LLVMdev] building whole-program bitcode with LLVM

Bram Adams bram.adams at ugent.be
Sat Aug 1 12:01:04 PDT 2009


Hi,

For my PhD work, I have used LLVM to transform whole-program bitcode  
modules of systems like Quake 3 and Parrot VM. As build system  
integration is a very complex problem in general, integrating LLVM in  
medium to large build systems was not straightforward, although I  
guess things should be easier now with the help of the gold plugin and  
libLTO.

In short, I was not able to find a fully automated, generic approach  
to integrate LLVM, as every build system is unique, and often contains  
subtle mistakes (invoking gcc directly instead of via $CC, ...).  
Instead, I used a tool-supported, manual approach consisting of the  
following 3 steps:
  1. Visualize and understand the existing build system
  2. Plan how my tool fits in
  3. Change the makefiles

In step 1, I used my MAKAO tool (http://users.ugent.be/~badams/makao/)  
to visualize the build dependency graph of a run of the existing build  
system. This gives an idea about all libraries and executables that  
are built, how they fit together and which makefile rules are  
responsible for them.
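
A minimal sketch of what such a dependency graph contains is given below. It is not MAKAO, which works from an actual run of the build; it only reads static `target: prerequisites` lines and ignores variables, pattern rules and includes. The function names and the sample rules are hypothetical and chosen purely for illustration.

```python
def parse_rules(makefile_text):
    """Map each target to its prerequisites, for plain 'target: deps' rules only."""
    graph = {}
    for line in makefile_text.splitlines():
        # Skip recipe lines (tab-indented) and comments.
        if line.startswith("\t") or line.lstrip().startswith("#"):
            continue
        # Skip lines without a rule separator and variable assignments (=, :=).
        if ":" not in line or "=" in line:
            continue
        targets, _, deps = line.partition(":")
        for target in targets.split():
            graph.setdefault(target, []).extend(deps.split())
    return graph


def transitive_inputs(graph, target):
    """Return every file that `target` depends on, directly or indirectly."""
    seen = set()
    stack = [target]
    while stack:
        for dep in graph.get(stack.pop(), []):
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return seen


# Made-up rules: an executable built from an object file and a helper library.
SAMPLE_RULES = (
    "app: main.o libutil.a\n"
    "libutil.a: util.o\n"
    "main.o: main.c\n"
    "util.o: util.c\n"
)
```

Calling `transitive_inputs(parse_rules(SAMPLE_RULES), "app")` lists everything that ends up in the executable, including the objects hidden inside the helper library, which is the kind of overview needed before deciding what to transform.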

Based on the information of step 1, I then determined in step 2 which  
libraries and executables I wanted to transform.

Finally, step 3 involved making system-dependent physical changes to  
the build system in order to deploy my tools the way I planned to in  
step 2. Sometimes, this could be done without touching the original  
makefiles, e.g. by overriding build variables. Often, more invasive  
changes were needed, such as splitting existing build rules or adding  
new ones.
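
Whether overriding build variables is enough depends on whether the makefiles call the compiler by name anywhere. The sketch below flags recipe lines that start with a hard-coded tool name instead of a variable such as $(CC). The list of tool names and the sample makefile are assumptions for illustration, and the check is deliberately crude: it only looks at the first word of tab-indented recipe lines.

```python
import re

# Tool names treated as "hard-coded"; this list is an assumption, extend as needed.
HARDCODED_TOOLS = ("gcc", "g++", "cc", "ld", "ar")

# Optional make recipe prefixes (@, -, +) and whitespace, then a bare tool name.
_TOOL_RE = re.compile(
    r"^[@\-+\s]*(" + "|".join(re.escape(t) for t in HARDCODED_TOOLS) + r")(\s|$)"
)


def find_hardcoded_tools(makefile_text):
    """Return (line_number, tool, command) for recipe lines that name a tool directly."""
    hits = []
    for number, line in enumerate(makefile_text.splitlines(), start=1):
        if not line.startswith("\t"):
            continue  # only tab-indented recipe lines run commands
        match = _TOOL_RE.match(line[1:])
        if match:
            hits.append((number, match.group(1), line.strip()))
    return hits


# Made-up makefile: two rules honour $(CC), one bypasses it.
SAMPLE_MAKEFILE = (
    "CC = gcc\n"
    "all: a.o b.o\n"
    "\t$(CC) -o app a.o b.o\n"
    "a.o: a.c\n"
    "\t$(CC) -c a.c\n"
    "b.o: b.c\n"
    "\t@gcc -c b.c\n"
)
```

For the sample, `find_hardcoded_tools(SAMPLE_MAKEFILE)` reports only the rule for b.o. An empty result suggests that a command-line override of the form `make CC=...` may reach every compile step; any hit marks a rule that would have to be edited by hand.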

In my experience, a good understanding of the build system at hand
(see step 1) is indispensable when making this kind of build change
in large systems. More information can be found in sections 7.3.1,
9.3.1 and 10.3.1 of my PhD thesis
(http://users.ugent.be/~badams/publications/2008/PhD.pdf).

Kind regards,

Bram Adams
SAIL, Queen's University (Canada)

On 31-Jul-09, at 9:11 PM, Kelly, Terence P (HP Labs Researcher) wrote:

> Hi,
>
> Professor Adve suggested that we post this question to llvm-dev.
> Thanks in advance for your advice.
>
> My colleagues and I want to create whole-program bitcode for large
> real programs like Apache, BIND, OpenLDAP, etc.  We want the
> whole-program bitcode to include every part of the program for which
> we have source code.  For example, in the case of Apache's "httpd"
> server, we want to create a whole-program bitcode file "httpd.bc"
> containing functions that the default build system stashes in various
> application-specific auxiliary libraries (e.g., Apache's libapr and
> libaprutil).
>
> Our motive is *not* link-time optimization; we're interested in
> analyzing and modifying the whole-program bitcode in other ways.
> Once we have created a whole-program bitcode, we want to compile it
> to native assembly, then pass it thru the native assembler & linker
> to obtain a native executable whose behavior (except for performance)
> is identical to that of an executable obtained from the default build
> system.  We do *not* want standard libraries like libc and libpthread
> to be incorporated as bitcode in the whole-program bitcode; they can
> be linked in at the final step, after we have converted the
> whole-program bitcode to native assembly and assembled & linked it.
>
> We have been able to achieve our goal for small programs consisting
> of a handful of translation units, so we know that our goal is
> attainable in principle.  Problems start when we tackle big programs
> with complex build systems.  We want to find a generic strategy that
> works with most real world open source C/C++ programs without too
> much fuss, because we want to use it on at least a dozen different
> programs.  Ideally we want a strategy that works with unmodified
> default build systems, because eventually we hope to produce a tool
> that is easy for other developers to use.
>
> Initially we had hoped simply to replace gcc, as, ld, etc. with their
> LLVM counterparts in the standard build systems, but we haven't been
> able to make that strategy work.  Several different approaches along
> these lines fail in various ways.  Some have recommended the Gold
> plugin, but it's not clear from the documentation that it does what
> we want, and we haven't been successful in installing it yet.
>
> Does anyone have experience in constructing whole-program bitcodes
> that include app-specific libraries for large open-source programs?
> If you could share the right tricks, that would be very helpful.
>
> Thanks!
>
> -- Terence Kelly, HP Labs


