[LLVMdev] strace for whole-program bitcodes (was: RE: building whole-program bitcode with LLVM)

Daniel Dunbar daniel at zuster.org
Thu Oct 15 08:12:53 PDT 2009


Hi Terence,

I believe this is in fact similar to an approach Coverity uses (or
used at one time) to determine reliably what was done during a build.
I can imagine that one can build a robust system following this
technique, but it also seems like it might be quite a bit of work.
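
To give a rough idea of the work involved in step B of the proposal
quoted below (turning the strace log into a recipe), here is a minimal
sketch. It assumes a log captured with something like
"strace -f -v -s 65536 -e trace=execve -o build.log make", and it
assumes each record has the usual "PID execve("path", ["arg", ...],
...) = 0" shape. The TOOL_MAP table and the function names are
hypothetical, chosen only for illustration.

```python
import os
import shlex

# Hypothetical mapping from native tools to LLVM-based stand-ins.
TOOL_MAP = {"gcc": "clang", "cc": "clang", "g++": "clang++", "ar": "llvm-ar"}

# Only a few simple escapes are decoded; octal/hex escapes are not handled.
_ESCAPES = {"n": "\n", "t": "\t", "r": "\r"}


def _read_string(text, pos):
    """Read one double-quoted string starting at text[pos] == '"'.

    Returns (decoded string, index just past the closing quote).
    """
    out = []
    i = pos + 1
    while text[i] != '"':
        if text[i] == "\\":
            i += 1
            out.append(_ESCAPES.get(text[i], text[i]))
        else:
            out.append(text[i])
        i += 1
    return "".join(out), i + 1


def parse_execve(line):
    """Return (path, argv) for a successful execve record, else None."""
    start = line.find('execve("')
    # Failed PATH probes end in "= -1 ENOENT ..."; keep only successes.
    if start < 0 or not line.rstrip().endswith("= 0"):
        return None
    path, i = _read_string(line, start + len("execve("))
    if line[i:i + 3] != ", [":
        return None
    i += 3
    argv = []
    while line[i] != "]":
        if line[i] == '"':
            arg, i = _read_string(line, i)
            argv.append(arg)
        else:
            i += 1  # skip the ", " separators between arguments
    return path, argv


def recipe(log_lines):
    """Build shell commands for mapped tools, in log order."""
    cmds = []
    for line in log_lines:
        parsed = parse_execve(line)
        if parsed is None:
            continue
        _, argv = parsed
        if argv and os.path.basename(argv[0]) in TOOL_MAP:
            tool = TOOL_MAP[os.path.basename(argv[0])]
            cmds.append(shlex.join([tool] + argv[1:]))
    return cmds
```

This ignores working directories, environment variables, and files
generated midway through the build, all of which a faithful replay
would also have to recover from the trace. That is where I would
expect most of the effort to go.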

Another possible alternative not mentioned is to teach the compiler
driver (clang, most likely) to understand how to deal with bitcode
files on platforms with no LLVM linker support. This isn't terribly
difficult, and would work as long as all access to the tools was done
through the driver (e.g., CC). There might still be problems with
build systems that call tools like ar/ld directly.
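
As a rough illustration of the first thing such a driver would need,
here is a sketch that sorts link inputs into bitcode and native files
by looking at the first four bytes. The two magic numbers are the raw
LLVM bitcode magic ('BC' 0xC0DE) and the bitcode wrapper magic
(0x0B17C0DE, stored little-endian). The function names are
hypothetical, and this is not how clang's driver is actually
structured.

```python
RAW_MAGIC = b"BC\xc0\xde"          # raw LLVM bitcode stream
WRAPPER_MAGIC = b"\xde\xc0\x17\x0b"  # bitcode wrapper header, little-endian


def is_bitcode(path):
    """Return True if the file starts with an LLVM bitcode magic number."""
    with open(path, "rb") as f:
        head = f.read(4)
    return head in (RAW_MAGIC, WRAPPER_MAGIC)


def split_link_inputs(paths):
    """Partition link inputs into (bitcode files, everything else)."""
    bitcode, native = [], []
    for p in paths:
        (bitcode if is_bitcode(p) else native).append(p)
    return bitcode, native
```

The driver could then lower the bitcode inputs to native objects
itself before handing everything to the system linker. Archives that
contain bitcode members would need separate handling, which this
sketch does not attempt.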

 - Daniel

On Thu, Oct 8, 2009 at 3:26 PM, Kelly, Terence P (HP Labs Researcher)
<terence.p.kelly at hp.com> wrote:
>
> Hi,
>
> It would be nice if it were easier for relative
> novices to build whole-program bitcodes for
> large, complex applications with hairy build
> systems.  Several readers of this list have
> been trying various approaches for a few months
> but as far as I know we haven't yet found a
> good general solution.  Approaches that have
> been tried include 1) placing wrappers for the
> usual tools (gcc, ar, as, ld, etc.) first on
> the $PATH, and having the wrappers pass the
> buck to the LLVM equivalent tools after cleaning
> up the arguments; and 2) using the Gold plugin.
>
> Recently another possibility occurred to me,
> and I'm wondering if anyone has tried it.
> The basic idea goes like this:  A) use the
> "strace" utility to trace the default build
> system and log all invocations of all tools;
> B) extract from the log a build recipe in the
> form of tool invocations, with the default
> tools replaced by LLVM equivalents.
>
> I started thinking along these lines after
> finding some genuine madness in a build system
> (it used AWK to munge together existing .c files
> into new ones midway through the build).  I want
> a method that's guaranteed to mimic faithfully
> an arbitrarily nutty default build system, and
> an strace-based approach seemed like a "Gordian
> knot" solution.  However I haven't tried it yet
> and I'm wondering if anyone else has, or if
> anyone can think of situations where it will
> fail.
>
> Thanks!
>
> -- Terence
>
>> -----Original Message-----
>> From: Kelly, Terence P (HP Labs Researcher)
>> Sent: Friday, July 31, 2009 6:12 PM
>> To: 'llvmdev at cs.uiuc.edu'
>> Cc: 'Vikram S. Adve'
>> Subject: building whole-program bitcode with LLVM
>>
>>
>> Hi,
>>
>> Professor Adve suggested that we post this question to llvm-dev.
>> Thanks in advance for your advice.
>>
>> My colleagues and I want to create whole-program bitcode for large
>> real programs like Apache, BIND, OpenLDAP, etc.  We want the
>> whole-program bitcode to include every part of the program for which
>> we have source code.  For example, in the case of Apache's "httpd"
>> server, we want to create a whole-program bitcode file "httpd.bc"
>> containing functions that the default build system stashes in various
>> application-specific auxiliary libraries (e.g., Apache's libapr and
>> libaprutil).
>>
>> Our motive is *not* link-time optimization; we're interested in
>> analyzing and modifying the whole-program bitcode in other ways.
>> Once we have created a whole-program bitcode, we want to compile it
>> to native assembly, then pass it through the native assembler & linker
>> to obtain a native executable whose behavior (except for performance)
>> is identical to that of an executable obtained from the default build
>> system.  We do *not* want standard libraries like libc and libpthread
>> to be incorporated as bitcode in the whole-program bitcode; they can
>> be linked in at the final step, after we have converted the
>> whole-program bitcode to native assembly and assembled & linked it.
>>
>> We have been able to achieve our goal for small programs consisting
>> of a handful of translation units, so we know that our goal is
>> attainable in principle.  Problems start when we tackle big programs
>> with complex build systems.  We want to find a generic strategy that
>> works with most real world open source C/C++ programs without too
>> much fuss, because we want to use it on at least a dozen different
>> programs.  Ideally we want a strategy that works with unmodified
>> default build systems, because eventually we hope to produce a tool
>> that is easy for other developers to use.
>>
>> Initially we had hoped simply to replace gcc, as, ld, etc. with their
>> LLVM counterparts in the standard build systems, but we haven't been
>> able to make that strategy work.  Several different approaches along
>> these lines fail in various ways.  Some have recommended the Gold
>> plugin, but it's not clear from the documentation that it does what
>> we want, and we haven't been successful in installing it yet.
>>
>> Does anyone have experience in constructing whole-program bitcodes
>> that include app-specific libraries for large open-source programs?
>> If you could share the right tricks, that would be very helpful.
>>
>> Thanks!
>>
>> -- Terence Kelly, HP Labs
>>
>> ________________________________
>>
>> From: Vikram S. Adve [mailto:vadve at cs.uiuc.edu]
>> Sent: Friday, July 24, 2009 8:05 PM
>> To: Kelly, Terence P (HP Labs Researcher)
>> Cc: Swarup Sahoo
>> Subject: Re: building complex software with LLVM
>>
>> Hi Terence,
>>
>> ...
>>
>> I also recommend sending any such technical
>> questions about LLVM to llvmdev at cs.uiuc.edu.
>> There are a large number of active (and very
>> helpful) LLVM users on that list.  Replies
>> go to the list so you should join the list
>> to see them.
>>
>> Good luck!
>>
>> --Vikram
>> Associate Professor, Computer Science
>> University of Illinois at Urbana-Champaign
>> http://llvm.org/~vadve
>>
>>