[LLVMdev] How the LLVM tools work together

Sat Oct 30 00:40:15 PDT 2010

Thanks Michael,

Your information was extremely helpful. Coming from a non-compiler background, I find it interesting to see how all the different components fit together, from the code developers write to the final output.

Cheers,

Stephen

On 30/10/2010, at 6:25 PM, Michael Spencer wrote:

> On Thu, Oct 28, 2010 at 4:41 PM, Stephen Norman <stenorman2001 at me.com> wrote:
>> Hi,
>> 
>> I've been reading through some of the documentation and I'm a little confused.
>> 
>> What I'm wondering is if someone could explain how the different tools in LLVM (llvmc, clang, llvm-gcc, llvm-ar, etc.) work together to go from the C code I create through to a running executable (after linking).
>> 
>> Apologies if this isn't the right list. I'm not a compiler developer so I'm rather a novice with how LLVM works.
>> 
>> Cheers,
>> 
>> Stephen
> 
> Most of the tools are really just compiler hacker tools that we use
> for development, testing, and demonstration. LLVM is designed to be
> used as a set of libraries instead of a set of tools. However, there's
> nothing stopping you from doing each step individually, and it can be
> quite informative.
> 
> clang contains a driver, much like gcc, that takes the source files
> and options you provide and produces the desired output. This can be
> anything from just preprocessing all the way down to a final
> executable.
> 
> So the command:
> % clang -O3 source.c -o prog.exe
> 
> Can be broken down into:
> 
> * Pre-process
>    - The .i suffix marks preprocessed C; clang treats .ii as
> preprocessed C++.
> % clang -E source.c -o source.i
> * Compile to the llvm intermediate representation
>    - This file is a human readable representation of the C input code
> for the specified target.
> % clang -S -emit-llvm source.i -o source.ll
> * Optimize
>    - This runs a set of optimizations on source.ll and writes the
> optimized result as bitcode, the binary encoding of the llvm-ir. Use
> the -S option to get readable output instead.
> % opt -O3 source.ll -o source-opt.bc
> * Generate machine code
>    - This lowers the llvm-ir to the target instruction set and
> optimizes it along the way.
> % llc -O3 source-opt.bc -o source.s
> * Assemble
> % as source.s -o source.o
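> * Link
>    - A bare ld invocation normally also needs the C runtime startup
> files and libc on its command line; the driver adds these when it
> does the link itself (clang source.o -o prog.exe).
> % ld source.o -o prog.exe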
> 
> clang doesn't directly run all these commands. It uses the libraries
> internally to do everything up to assembly output, and on some
> platforms it even does the assembling internally.
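
The staged commands above can be strung together into one script. This is a minimal sketch, not what the driver literally runs: the scratch directory, the sample program, and the `status` variable are assumptions made for illustration, and the last step lets clang assemble and link so that the C runtime and libc are picked up automatically.

```shell
#!/bin/sh
# Sketch only: file names and the scratch directory are illustrative.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# A tiny C program to push through the stages.
cat > source.c <<'EOF'
#include <stdio.h>
int main(void) { puts("hello from the pipeline"); return 0; }
EOF

status=skipped   # stays "skipped" when the LLVM tools are not installed
if command -v clang >/dev/null 2>&1 &&
   command -v opt   >/dev/null 2>&1 &&
   command -v llc   >/dev/null 2>&1; then
    status=failed
    # Note: recent clang releases mark unoptimized functions "optnone",
    # so opt may leave them untouched; the flow of files is the same.
    clang -E source.c -o source.i &&               # preprocess
    clang -S -emit-llvm source.i -o source.ll &&   # C -> readable LLVM IR
    opt -O3 source.ll -o source-opt.bc &&          # IR -> optimized bitcode
    llc -O3 source-opt.bc -o source.s &&           # bitcode -> target assembly
    clang source.s -o prog &&                      # assemble + link via the driver
    status=ok
fi
echo "pipeline: $status"
```

To see what the driver itself would do for the one-line build, `clang -### -O3 source.c -o prog.exe` prints the commands it would run without executing them.
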
> 
> * llvm-{as,dis} are just used to convert to and from the bitcode and
> human readable llvm-ir.
> * llvm-ar is for creating standard archives containing bitcode.
> * llvmc ... I'm still confused about the exact reason for this one.
> * llvm-diff produces intelligent diffs between two llvm-ir files
> ignoring names. Makes it much easier to tell what semantics changed
> when values are renamed.
> * llvm-ld is really just a driver for the system linker. It can also
> produce scripts that run the bitcode via lli.
> * llvm-link links llvm-ir files together.
> * llvm-mc is the machine code playground. It can be used as an
> assembler, disassembler, and other things.
> * llvm-nm is classic unix nm for llvm-ir. It dumps the symbol table.
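
A few of the tools in the list above can be tried without any C source at all. The following is a rough sketch under stated assumptions: the two hand-written IR files, their names, and the `status` variable are hypothetical, chosen only to show llvm-as, llvm-link, llvm-dis, and llvm-nm handing files to each other.

```shell
#!/bin/sh
# Sketch only: add.ll and main.ll are made-up inputs for illustration.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

cat > add.ll <<'EOF'
define i32 @add(i32 %a, i32 %b) {
  %sum = add i32 %a, %b
  ret i32 %sum
}
EOF

cat > main.ll <<'EOF'
declare i32 @add(i32, i32)

define i32 @main() {
  %r = call i32 @add(i32 2, i32 3)
  ret i32 %r
}
EOF

status=skipped   # stays "skipped" when the LLVM tools are not installed
if command -v llvm-as   >/dev/null 2>&1 &&
   command -v llvm-dis  >/dev/null 2>&1 &&
   command -v llvm-link >/dev/null 2>&1 &&
   command -v llvm-nm   >/dev/null 2>&1; then
    status=failed
    llvm-as add.ll -o add.bc &&               # readable IR -> bitcode
    llvm-as main.ll -o main.bc &&
    llvm-link add.bc main.bc -o whole.bc &&   # merge the two modules
    llvm-dis whole.bc -o whole.ll &&          # bitcode -> readable IR
    llvm-nm whole.bc &&                       # list the symbols in the module
    status=ok
fi
echo "round trip: $status"
```

After a successful run, whole.ll holds both functions in a single module, which is a quick way to see that llvm-link works on IR rather than on object files.
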
> 
> And I don't know what the rest are for exactly.
> 
> You don't need to know about any of these to use clang or llvm-gcc,
> but they can be useful when playing with llvm.
> 
> - Michael Spencer