[LLVMdev] strace for whole-program bitcodes (was: RE: building whole-program bitcode with LLVM)

Kelly, Terence P (HP Labs Researcher) terence.p.kelly at hp.com
Thu Oct 15 16:52:33 PDT 2009


Hi Daniel,

Thanks for your reply.

Do we know if the LLVM developers intend to
address this problem in a comprehensive way?
The existing LLVM tools are not quite drop-in
replacements for their standard GCC counterparts;
that's the source of the problems that various
people have encountered when trying to develop
a fully general way to get whole-program bitcodes.

If the LLVM tools *were* fully compatible, I
think that would remove an impediment to much
wider usage of LLVM.  Is full compatibility a
goal for the LLVM developers?

-- Terence

> -----Original Message-----
> From: daniel.dunbar at gmail.com 
> [mailto:daniel.dunbar at gmail.com] On Behalf Of Daniel Dunbar
> Sent: Thursday, October 15, 2009 8:13 AM
> To: Kelly, Terence P (HP Labs Researcher)
> Cc: llvmdev at cs.uiuc.edu
> Subject: Re: [LLVMdev] strace for whole-program bitcodes 
> (was: RE: building whole-program bitcode with LLVM)
> 
> Hi Terence,
> 
> I believe that this is in fact similar to an approach Coverity uses
> (or used at one time) as a robust solution to determine what was done
> during a build. I can imagine that one can build a robust system
> following this technique, but it also seems like it might be quite a
> bit of work.
> 
> Another possible alternative not mentioned is to teach the compiler
> driver (clang, most likely) to understand how to deal with bitcode
> files on platforms with no LLVM linker support. This isn't terribly
> difficult, and would work as long as all access to the tools was done
> through the driver (e.g., CC). There might still be problems with
> build systems that call tools like ar/ld directly.
> 
>  - Daniel
> 
> On Thu, Oct 8, 2009 at 3:26 PM, Kelly, Terence P (HP Labs Researcher)
> <terence.p.kelly at hp.com> wrote:
> >
> > Hi,
> >
> > It would be nice if it were easier for relative
> > novices to build whole-program bitcodes for
> > large, complex applications with hairy build
> > systems.  Several readers of this list have
> > been trying various approaches for a few months
> > but as far as I know we haven't yet found a
> > good general solution.  Approaches that have
> > been tried include 1) placing wrappers for the
> > usual tools (gcc, ar, as, ld, etc.) first on
> > the $PATH, and having the wrappers pass the
> > buck to the LLVM equivalent tools after cleaning
> > up the arguments; and 2) using the Gold plugin.
> >
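
For concreteness, approach (1) above might look roughly like
the following. This is only a sketch under assumptions: the
tool mapping and the list of dropped flags are hypothetical
placeholders, not a tested compatibility table, and the name
translate() is made up for illustration.

```python
import os

# Hypothetical mapping from the usual tools to LLVM-based replacements.
TOOL_MAP = {"gcc": "clang", "cc": "clang", "g++": "clang++", "ar": "llvm-ar"}

# Hypothetical examples of flags a wrapper might strip before passing the buck.
DROP_FLAGS = {"-fno-gcse", "-frerun-cse-after-loop"}


def translate(argv):
    """Return the argv the wrapper would run instead of the original one."""
    tool = os.path.basename(argv[0])
    new_tool = TOOL_MAP.get(tool)
    if new_tool is None:
        # Not a tool we wrap: leave the invocation unchanged.
        return list(argv)
    args = [a for a in argv[1:] if a not in DROP_FLAGS]
    # For compile-only steps, ask the compiler to emit bitcode
    # rather than a native object file.
    if new_tool.startswith("clang") and "-c" in args and "-emit-llvm" not in args:
        args.insert(0, "-emit-llvm")
    return [new_tool] + args
```

A wrapper script installed first on the $PATH under the name
"gcc" could call translate(sys.argv) and then os.execvp() the
result. The hard part is the flag table, which this sketch
does not attempt to get right.
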
> > Recently another possibility occurred to me,
> > and I'm wondering if anyone has tried it.
> > The basic idea goes like this:  A) use the
> > "strace" utility to trace the default build
> > system and log all invocations of all tools;
> > B) extract from the log a build recipe in the
> > form of tool invocations, with the default
> > tools replaced by LLVM equivalents.
> >
> > I started thinking along these lines after
> > finding some genuine madness in a build system
> > (it used AWK to munge together existing .c files
> > into new ones midway through the build).  I want
> > a method that's guaranteed to mimic faithfully
> > an arbitrarily nutty default build system, and
> > an strace-based approach seemed like a "Gordian
> > knot" solution.  However I haven't tried it yet
> > and I'm wondering if anyone else has, or if
> > anyone can think of situations where it will
> > fail.
> >
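
Steps (A) and (B) above could be sketched as follows. This is
untested against a real build and rests on assumptions: the
log is assumed to come from something like
"strace -f -v -s 4096 -e trace=execve -o build.log make",
with each execve() call on a single line, and the tool mapping
is a hypothetical placeholder. The function names are made up
for illustration.

```python
import os
import re

# A C-style string literal as strace prints it (escapes are left undecoded).
STRING = re.compile(r'"((?:[^"\\]|\\.)*)"')
# Start of an execve record, up to the opening bracket of argv.
EXECVE = re.compile(r'execve\("((?:[^"\\]|\\.)*)", \[')

# Hypothetical mapping from default tools to LLVM equivalents.
TOOL_MAP = {"gcc": "clang", "ar": "llvm-ar"}


def parse_execve(line):
    """Return (path, argv) for a successful execve line, else None."""
    m = EXECVE.search(line)
    if m is None or not line.rstrip().endswith("= 0"):
        return None
    pos = m.end()
    argv = []
    while pos < len(line) and line[pos] != "]":
        s = STRING.match(line, pos)
        if s is None:
            return None  # truncated ("...") or unexpected output
        argv.append(s.group(1))
        pos = s.end()
        if line.startswith(", ", pos):
            pos += 2
    return m.group(1), argv


def build_recipe(log_lines, tool_map=TOOL_MAP):
    """Keep only invocations of mapped tools, with the tool name replaced."""
    recipe = []
    for line in log_lines:
        parsed = parse_execve(line)
        if parsed is None:
            continue
        path, argv = parsed
        tool = os.path.basename(path)
        if tool in tool_map:
            # Other invocations (e.g. an awk step) are skipped here
            # and would need separate handling.
            recipe.append([tool_map[tool]] + argv[1:])
    return recipe
```

The sketch ignores the working directory of each invocation
and any execve() record that strace splits across lines, so a
real version would have to track more than this.
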
> > Thanks!
> >
> > -- Terence
> >
> >> -----Original Message-----
> >> From: Kelly, Terence P (HP Labs Researcher)
> >> Sent: Friday, July 31, 2009 6:12 PM
> >> To: 'llvmdev at cs.uiuc.edu'
> >> Cc: 'Vikram S. Adve'
> >> Subject: building whole-program bitcode with LLVM
> >>
> >>
> >> Hi,
> >>
> >> Professor Adve suggested that we post this question to llvm-dev.
> >> Thanks in advance for your advice.
> >>
> >> My colleagues and I want to create whole-program bitcode for large
> >> real programs like Apache, BIND, OpenLDAP, etc.  We want the
> >> whole-program bitcode to include every part of the program for which
> >> we have source code.  For example, in the case of Apache's "httpd"
> >> server, we want to create a whole-program bitcode file "httpd.bc"
> >> containing functions that the default build system stashes in various
> >> application-specific auxiliary libraries (e.g., Apache's libapr and
> >> libaprutil).
> >>
> >> Our motive is *not* link-time optimization; we're interested in
> >> analyzing and modifying the whole-program bitcode in other ways.
> >> Once we have created a whole-program bitcode, we want to compile it
> >> to native assembly, then pass it thru the native assembler & linker
> >> to obtain a native executable whose behavior (except for performance)
> >> is identical to that of an executable obtained from the default build
> >> system.  We do *not* want standard libraries like libc and libpthread
> >> to be incorporated as bitcode in the whole-program bitcode; they can
> >> be linked in at the final step, after we have converted the
> >> whole-program bitcode to native assembly and assembled & linked it.
> >>
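
The final stage described above (merge the bitcode, compile it
to native assembly, then assemble and link against the
standard libraries) can be sketched as a list of commands.
This is a minimal sketch, assuming llvm-link, llc and a native
"cc" driver are installed; the input file names in the usage
note and the function name are hypothetical.

```python
def whole_program_commands(bitcode_files, program, system_libs=()):
    """Build the three commands for the merge / compile / link stage."""
    whole = program + ".bc"
    asm = program + ".s"
    return [
        # Merge the per-unit and per-library bitcode into one module.
        ["llvm-link", *bitcode_files, "-o", whole],
        # Compile the whole-program bitcode to native assembly.
        ["llc", whole, "-o", asm],
        # Assemble and link natively; standard libraries come in here.
        ["cc", asm, "-o", program, *["-l" + lib for lib in system_libs]],
    ]
```

For example, whole_program_commands(["server.bc", "libapr.bc"],
"httpd", ["pthread"]) yields commands that could each be run
with subprocess.run(cmd, check=True). Any analysis or
modification of httpd.bc would happen between the first and
second commands.
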
> >> We have been able to achieve our goal for small programs consisting
> >> of a handful of translation units, so we know that our goal is
> >> attainable in principle.  Problems start when we tackle big programs
> >> with complex build systems.  We want to find a generic strategy that
> >> works with most real world open source C/C++ programs without too
> >> much fuss, because we want to use it on at least a dozen different
> >> programs.  Ideally we want a strategy that works with unmodified
> >> default build systems, because eventually we hope to produce a tool
> >> that is easy for other developers to use.
> >>
> >> Initially we had hoped simply to replace gcc, as, ld, etc. with their
> >> LLVM counterparts in the standard build systems, but we haven't been
> >> able to make that strategy work.  Several different approaches along
> >> these lines fail in various ways.  Some have recommended the Gold
> >> plugin, but it's not clear from the documentation that it does what
> >> we want, and we haven't been successful in installing it yet.
> >>
> >> Does anyone have experience in constructing whole-program bitcodes
> >> that include app-specific libraries for large open-source programs?
> >> If you could share the right tricks, that would be very helpful.
> >>
> >> Thanks!
> >>
> >> -- Terence Kelly, HP Labs
> >>
> >> ________________________________
> >>
> >> From: Vikram S. Adve [mailto:vadve at cs.uiuc.edu]
> >> Sent: Friday, July 24, 2009 8:05 PM
> >> To: Kelly, Terence P (HP Labs Researcher)
> >> Cc: Swarup Sahoo
> >> Subject: Re: building complex software with LLVM
> >>
> >> Hi Terence,
> >>
> >> ...
> >>
> >> I also recommend sending any such technical
> >> questions about LLVM to llvmdev at cs.uiuc.edu.
> >> There are a large number of active (and very
> >> helpful) LLVM users on that list.  Replies
> >> go to the list so you should join the list
> >> to see them.
> >>
> >> Good luck!
> >>
> >> --Vikram
> >> Associate Professor, Computer Science
> >> University of Illinois at Urbana-Champaign
> >> http://llvm.org/~vadve
> >>
> >>
> 


