[LLVMdev] Compiler Driver [high-level comments]

Wed Jul 28 11:26:47 PDT 2004

On Wed, 28 Jul 2004, Reid Spencer wrote:
> 2. MODE OF OPERATION
> ====================
> The driver will simply read its command line arguments, read its
> configuration data, and invoke the compilation, linking, and
> optimization tools necessary to complete the user's request. Its basic

I'm not sure that I agree with this.  Compilers need to be extremely
predictable and simple.  In particular, saying:

llvmgcc x.c y.c z.c

should invoke exactly the same tools as:

llvmgcc x.c -c
llvmgcc y.c -c
llvmgcc z.c -c
llvmgcc x.o y.o z.o

I don't necessarily think that you're contradicting this; I just wanted to
make sure we're on the same page.
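
The invariant above can be sketched in Python. This is only an illustration,
assuming a hypothetical driver that plans its work as a list of
(tool, inputs, output) steps; the names `plan`, "compile" and "link" are
made up for the example and do not refer to any existing tool.

```python
import os

def plan(args):
    """Return the (tool, inputs, output) steps for one driver invocation.

    Hypothetical model: "compile" and "link" stand in for whatever
    tools the real driver would run.
    """
    compile_only = "-c" in args
    files = [a for a in args if not a.startswith("-")]
    steps, objects = [], []
    for f in files:
        base, ext = os.path.splitext(f)
        if ext == ".c":
            obj = base + ".o"
            steps.append(("compile", (f,), obj))
            objects.append(obj)
        else:
            # Already an object file: pass it straight to the link step.
            objects.append(f)
    if not compile_only:
        steps.append(("link", tuple(objects), "a.out"))
    return steps

# One-shot build versus separate compiles followed by a link.
one_shot = plan(["x.c", "y.c", "z.c"])
separate = (plan(["x.c", "-c"]) + plan(["y.c", "-c"]) + plan(["z.c", "-c"])
            + plan(["x.o", "y.o", "z.o"]))
assert one_shot == separate
```

A real driver would probably use temporary names for the intermediate
objects in the one-shot case, but the tools run, and their order, should be
the same.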

> 4. SIMILAR OPTIONS AS GCC
> =========================
> Certain common GCC options should be supported in order to make the
> driver appear familiar to users of GCC. In particular, the following
> options are important to preserve:

Very important, I agree.

> Additionally, we should have options to:
> * generate analysis reports ala the LLVM analyze tool

I'm not certain how useful this would be.  It would add complexity to the
driver that is of arguable use.  If anything I would make this the last
priority: the people who use 'analyze' are compiler developers, not end
users.

> * have a "no op" mode like -v where it just reports what it would do
> * have a language specific help utility based on suffixes. For example,
>   --help ll would list the options applicable to *.ll input files. This
>   would extend to source languages too (e.g. --help c for C help or
>   --help f for FORTRAN help). The generated help info would be specific
>   for the given language, after the config files have been read thus
>   allowing the output to vary depending on the driver's configuration.
> * Support the -- option to terminate command line options and indicate
>   the remaining options are files to be processed. This
> * Support command line configuration (override config files on the
>   command line) either by specifying a config file or using special
>   configuration options.
> * each option should have short (-X) and long (--language) variants

Sure.

> 5. BASIC/STANDARD COMPILATION TASKS
> ===================================
> The driver will perform basic tasks such as compilation, optimization,
> and linking. The following definitions are suggested, but more could be
> supported.

There has been a lot of discussion/confusion on IRC relating to what
actually will go into .s or .o files.  In particular, some people were
arguing that if we output a .o file, it should contain only native
code.  This means that these two commands would do very different things:

llvmgcc x.c -o x.o     # compile to native .o
llvmgcc x.c -o x.bc    # compile to bytecode

I have to say that I *strenuously* object to this behavior.  In
particular, this would require all users to change their makefiles to get
IPO/lifelong optimization support from LLVM, violating one of the main
goals of the system.

There are a couple of things that people brought up (including wrapping
.bc files in ELF sections, generating .o files containing native
code+.bc), but here is the proposal that I like best:  :)

I don't think that anything should change w.r.t. the contents of .o files.
In particular, .o files should contain LLVM bytecode without wrappers or
anything fancy around them.  The big problem with this is compiler
interoperability: mixing .o files from various compilers (e.g. a native
GCC) will not work, because 'ld' will barf when it hits an LLVM .o file.

Personally I don't see a problem with this.  We already have "llvm aware"
replacements for many system tools, including ld, nm, and a start for ar.
These tools could be made 'native aware', so that 'llvm-ld x.o b.o' would
do the right thing for mixed native and llvm .o files.  Imagine an
llvm-objdump tool that either runs the native objdump program or llvm-dis
depending on the file type.
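
A rough sketch of such a dispatching wrapper, in Python.  It assumes that
LLVM bytecode files can be recognized by a 4-byte "llvm" signature at the
start of the file; that signature, and the helper names `pick_tool` and
`pick_tool_for_file`, are assumptions for illustration only.

```python
# Assumption: LLVM bytecode files start with the 4-byte signature "llvm".
# Adjust this if the real file format uses a different signature.
BYTECODE_MAGIC = b"llvm"

def pick_tool(header, native_tool="objdump", llvm_tool="llvm-dis"):
    """Choose which program to run from the first bytes of a file."""
    if header.startswith(BYTECODE_MAGIC):
        return llvm_tool
    # Anything else is handed to the native tool unchanged.
    return native_tool

def pick_tool_for_file(path):
    """Read just enough of the file to classify it."""
    with open(path, "rb") as f:
        return pick_tool(f.read(len(BYTECODE_MAGIC)))

# An ELF object (magic 0x7f 'E' 'L' 'F') goes to the native tool.
assert pick_tool(b"\x7fELF") == "objdump"
assert pick_tool(b"llvm") == "llvm-dis"
```

The same check could drive a 'native aware' llvm-ld or llvm-nm: classify
each input first, then decide which inputs need the LLVM path.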

The one major thing that I want to fix is the current kludge of using
llvmgcc -S or llvmgcc -c to control whether the compile-time optimizer is
run.  The only reason we did this was because it was easy, and a new
compiler driver is exactly what we need to fix this.  In particular, I
would really like to see something like this:

llvmgcc X.c -S     # compiles, runs gccas, emits an *optimized* .ll file
llvmgcc X.c -c     # Same as -S, but now in .bc form instead of .ll form
llvmgcc X.c -On -S # "no" optimization, emit a 'raw' .ll file
llvmgcc X.c -On -c # "no" optimization, emit a 'raw' .bc file

Basically, today's equivalents to these are:

llvmgcc X.c -c -o - | llvm-dis > X.s
llvmgcc X.c -c
llvmgcc X.c -S
llvmgcc X.c -S -o - | llvm-as > X.o

The ability to capture the raw output of a front-end is very useful and
important, but it should be controlled with -O options, not -S/-c.  Also,
llvmgcc -O0 is not necessarily the same as -On, because some optimizations
actually speed up compilation (e.g., dead code elim).
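
The proposed split between -S/-c and -O can be written down as a small
Python sketch.  The function `plan_output` is hypothetical and models only
the four command lines listed above: -S/-c choose the output form, and -On
alone decides whether the compile-time optimizer (gccas) runs.

```python
def plan_output(flags):
    """Map driver flags to (run_optimizer, output_form) under the proposal.

    -S / -c pick only the output form (.ll text or .bc bytecode);
    -On alone turns the compile-time optimizer off.
    """
    if "-S" in flags:
        form = "ll"
    elif "-c" in flags:
        form = "bc"
    else:
        raise ValueError("this sketch only models -S and -c")
    # Only -On means "raw front-end output"; -O0 is deliberately not
    # treated as equivalent to -On in this model.
    run_optimizer = "-On" not in flags
    return run_optimizer, form

assert plan_output(["X.c", "-S"]) == (True, "ll")
assert plan_output(["X.c", "-c"]) == (True, "bc")
assert plan_output(["X.c", "-On", "-S"]) == (False, "ll")
assert plan_output(["X.c", "-On", "-c"]) == (False, "bc")
```

Which passes -O0 would still run is left open here; the sketch only
records that it need not match -On.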

Anyway, these are just some high-level ideas.

-Chris

-- 
http://llvm.cs.uiuc.edu/
http://nondot.org/sabre/