[cfe-dev] RFC: Integrating clang-cc functionality into clang (the driver)
Daniel Dunbar
daniel at zuster.org
Tue Nov 3 13:07:28 PST 2009
Hi all,
I've been thinking lately about how we can push forward with our goal of
integrating the 'clang-cc' functionality into the 'clang' executable, so that we
have a single compiler binary. This will also unblock future work on clang APIs,
and hopefully make it easier to support new interesting uses of clang.
Here's my proposal:
--
Goals
--
1. Make it easier to build clang based tools (from an API perspective).
2. Avoid unnecessary fork/exec of clang-cc.
a. Makes it easier to debug!
b. Make driver / compiler interaction more obviously a private implementation
detail.
Non-Goals
--
1. Add a general purpose mechanism for extending 'clang' (e.g., a plugin
model). This work will make that easier, however.
Proposal (user level)
--
1. Driver gets a new option -cc1, which must be the leading argument (after any
-ccc arguments, but those are "internal" and not supposed to be used by users
anyway). This is a "mode": the remaining arguments will be processed "like"
clang-cc arguments. This is just for debuggability, and for use in -v or -###.
In practice, the arguments will be processed by hand or by reusing the driver
argument parsing functionality instead of using LLVM's command line library.
For example, where 'clang' currently does something like:
--
$ clang -S -x c /dev/null -###
...
"/Volumes/Data/ddunbar/llvm.obj.64/Debug/bin/clang-cc" "-triple"
"x86_64-apple-darwin10.0" "-S" "-disable-free" "-main-file-name"
"null" "--relocation-model" "pic" "-pic-level=1" "--disable-fp-elim"
"--unwind-tables=1" "--mcpu=core2" "--fmath-errno=0" "-fexceptions=0"
"-fdiagnostics-show-option" "-o" "null.s" "-x" "c" "/dev/null"
--
it would now print:
--
$ clang -S -x c /dev/null -###
...
"/Volumes/Data/ddunbar/llvm.obj.64/Debug/bin/clang" "-cc1" "-triple"
"x86_64-apple-darwin10.0" "-S" "-disable-free" "-main-file-name"
"null" "--relocation-model" "pic" "-pic-level=1" "--disable-fp-elim"
"--unwind-tables=1" "--mcpu=core2" "--fmath-errno=0" "-fexceptions=0"
"-fdiagnostics-show-option" "-o" "null.s" "-x" "c" "/dev/null"
--
and that command would actually work when run on the command line.
The reason for choosing -cc1 is that this is the traditional gcc-style name
for the "compiler" (versus the "driver"), and to make it more obvious that this
is an "internal" option, not a user level one.
The initial focus for -cc1 would be to implement the clang-cc options that the
driver uses, but it would be easy to add support for some additional clang-cc
modes at the same time (for example, -ast-dump).
2. 'clang' gets a new option -no-integrated-cc1 which would just execute
'clang' recursively, passing the -cc1 argument. This is primarily for testing;
users shouldn't have a good reason to use it.
3. We'll take some steps to still be friendly if clang crashes (currently the
driver tries to at least print a canonical "error: clang-cc failed" type of
message).
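To make the "leading argument" rule in item 1 concrete, here is a minimal
sketch of how the mode check could look. The function name isCC1Mode is
hypothetical, and the sketch assumes every internal -ccc option is a single
token (an option that takes a separate value would need extra handling).

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not part of any existing clang API.
// Returns true if the first argument after the program name and any
// leading "-ccc..." internal options is exactly "-cc1".
bool isCC1Mode(const std::vector<std::string> &Args) {
  std::size_t I = 1; // Args[0] is the program name.
  // Skip internal driver options. Assumption: each one is a single token.
  while (I < Args.size() && Args[I].compare(0, 4, "-ccc") == 0)
    ++I;
  return I < Args.size() && Args[I] == "-cc1";
}
```

With this shape, a -cc1 that appears after an ordinary option is not treated
as a mode switch, which keeps it out of the way of user-level arguments.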
Proposal (implementation)
--
1. There will be a new class CompilerInstance (suggestions for a better name
welcome) which holds all of the state needed for running Clang. That is, this
will wrap the source manager, the file manager, the preprocessor factory, the
AST context, the AST consumer, and all that horrible stuff. This will probably
actually be constructed via a builder.
2. Internally there will be a CompilerInvocation object which maintains the
various bits of state that form a single invocation of clang-cc (include
paths, target options, triple, code generation options, etc.).
a. The CompilerInvocation object will have two important methods. The first
converts the invocation into a list of 'clang -cc1' arguments. The second
"executes" the invocation and returns a CompilerInstance instance.
b. The Driver will get a new CompilerJob class which just wraps a
CompilerInvocation. The Driver's Clang tool implementation will be changed to
construct an instance of this object instead of constructing a list of
arguments. This job will take care of running the clang compiler in/out-of
process depending on -no-integrated-cc1, but otherwise is just an adaptor for
CompilerInvocation.
c. There will be a method to turn a 'clang -cc1' argument list into a
CompilerInvocation object.
3. The Driver will get a new API for parsing a "gcc-like" argument list which
corresponds to a single "compile only" task (-fsyntax-only, -S, etc.), and
returns a CompilerJob. This API will return an error for argument vectors which
would do something more complicated, for example executing multiple
compilations or running the linker or assembler.
4. Move "standard" tests to use 'clang -cc1' instead of 'clang-cc'.
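As a rough illustration of items 2a and 2c, the sketch below shows an
invocation object that can be rendered to a 'clang -cc1' argument list and
rebuilt from one. Everything here is hypothetical: InvocationSketch, its
fields, and the method names are placeholders, and only a handful of options
("-triple", "-I", "-o") are handled. The real object would carry far more
state and would also have the "execute" method.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, heavily simplified stand-in for the proposed
// CompilerInvocation. Field and method names are illustrative only.
struct InvocationSketch {
  std::string Triple;
  std::string OutputFile;
  std::vector<std::string> IncludePaths;
  std::vector<std::string> Inputs;

  // Render the invocation as 'clang -cc1' arguments (program name excluded).
  std::vector<std::string> toCC1Args() const {
    std::vector<std::string> Args;
    Args.push_back("-cc1");
    if (!Triple.empty()) {
      Args.push_back("-triple");
      Args.push_back(Triple);
    }
    for (const std::string &P : IncludePaths) {
      Args.push_back("-I");
      Args.push_back(P);
    }
    if (!OutputFile.empty()) {
      Args.push_back("-o");
      Args.push_back(OutputFile);
    }
    for (const std::string &In : Inputs)
      Args.push_back(In);
    return Args;
  }

  // Rebuild an invocation from a '-cc1' argument list. Returns false on a
  // missing "-cc1", an unknown option, or an option missing its value.
  static bool fromCC1Args(const std::vector<std::string> &Args,
                          InvocationSketch &Out) {
    if (Args.empty() || Args[0] != "-cc1")
      return false;
    InvocationSketch Result;
    for (std::size_t I = 1; I < Args.size(); ++I) {
      const std::string &A = Args[I];
      if (A == "-triple" || A == "-I" || A == "-o") {
        if (I + 1 == Args.size())
          return false; // option is missing its value
        const std::string &V = Args[++I];
        if (A == "-triple")
          Result.Triple = V;
        else if (A == "-I")
          Result.IncludePaths.push_back(V);
        else
          Result.OutputFile = V;
      } else if (!A.empty() && A[0] == '-') {
        return false; // unknown option: reject rather than guess
      } else {
        Result.Inputs.push_back(A);
      }
    }
    Out = Result;
    return true;
  }
};
```

Rejecting unknown options in fromCC1Args is one way to get the stricter
driver/compiler contract: anything the driver emits has to round-trip through
the same object.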
The Future of clang-cc
--
clang-cc is kind of a mess, so at least initially I'd rather just move the
driver and appropriate tests to using the 'clang' executable. Once that's done
we can reevaluate and see what the next step is. One option is to keep clang-cc
around as a dumping/play ground for tools or other features that don't fit into
the "compiler" model of functionality. Another option is to extend 'clang' to
support the main features of clang-cc we care about (i.e., the ones we test) and
move everything else into separate tools (which would probably only be
optionally built -- these would amount to examples).
Impact
--
This redivision of clang/clang-cc and new API hooks open up our architecture in
a few nice ways.
1. It becomes much easier to implement a Clang based tool which leverages the
Driver library to provide a gcc-like command line interface.
The idea is that a client would use the new Driver API to construct a
CompilerJob, and could then twiddle the CompilerInvocation object or the
CompilerInstance object to implement their tool (for example, supplying their
own AST consumer).
2. We retain some reasonable semantics for -### and -v that closely match
existing behavior.
3. If we desire to keep clang-cc, we should be able to move a large part of its
internals to using CompilerInvocation and CompilerInstance which should make it
easier to understand and maintain.
4. Programmatically driving the compiler (i.e., implementing a fixed function
Clang based tool that doesn't need to process a gcc-like command line) should
be *much* easier. Those clients will have the option of constructing a
CompilerInvocation object, or using a CompilerInstance object directly.
5. This will make the connection between the driver and the compiler more
rigorous; for example, the driver will not be capable of passing an option to
clang-cc that clang-cc doesn't understand.
6. This should make it easier to build new tools which need more information
about how the compiler is invoked. For example, a long standing wish of mine is
to add a mode to the driver which will automatically produce test cases, which
requires knowing how the compiler was invoked, then being able to easily
manipulate the command line to generate a preprocessed input, eliminate command
line arguments, reduce optimization level, etc.
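To sketch how a tool might use the "compile only" driver API mentioned in
item 1, the example below accepts a gcc-like argument list only when it
describes a single compile-only task. The names (ActionKind, CompileJobSketch,
parseCompileOnly) are hypothetical, and only -fsyntax-only and -S are
recognized; a real implementation would reuse the driver's own argument
parsing.

```cpp
#include <string>
#include <vector>

// Hypothetical types; the proposal only says the API returns a CompilerJob.
enum class ActionKind { SyntaxOnly, EmitAssembly };

struct CompileJobSketch {
  ActionKind Action;
  std::string Input;
};

// Accept only argument lists that describe exactly one compile-only task.
// Returns false for anything more complicated: no compile-only flag (which
// would imply assembling and linking), several inputs (several compilations),
// or an option this sketch does not know.
bool parseCompileOnly(const std::vector<std::string> &Args,
                      CompileJobSketch &Job) {
  bool HaveAction = false;
  ActionKind Action = ActionKind::SyntaxOnly;
  std::vector<std::string> Inputs;
  for (const std::string &A : Args) {
    if (A == "-fsyntax-only") {
      HaveAction = true;
      Action = ActionKind::SyntaxOnly;
    } else if (A == "-S") {
      HaveAction = true;
      Action = ActionKind::EmitAssembly;
    } else if (!A.empty() && A[0] == '-') {
      return false; // unsupported option in this sketch
    } else {
      Inputs.push_back(A);
    }
  }
  if (!HaveAction || Inputs.size() != 1)
    return false;
  Job.Action = Action;
  Job.Input = Inputs[0];
  return true;
}
```

A client would call something like this, then adjust the resulting job (for
example, swapping in its own AST consumer) before running it.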
Caveats
--
1. One major caveat is the current use of the LLVM command line library to
interact with the back end. For example, the driver currently passes options
like '--relocation-model=pic' to clang-cc. This option isn't actually defined
in clang-cc, rather it is defined in the LLVM code generator and things work
out because of how LLVM's command line handling works.
This is both a wart and a benefit -- it's a wart in that it's a hidden
dependency, and it blocks using the APIs safely in some contexts (for example,
from multiple threads). It's a benefit because it provides a generic mechanism
for twiddling options in the back end for debugging or testing new features.
My current plan is to not try to solve this problem, but instead support some
generic argument vector (a list of strings) in the CompilerInvocation object
which will get passed to the LLVM command line parsing library when the invocation
is executed. We should endeavor to never use that mechanism for any features
that matter, but this requires us to add proper API mechanisms for setting
things like the relocation model.
2. Chime in!
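For caveat 1, the pass-through vector could look roughly like the sketch
below. BackendArgsSketch and buildBackendArgv are hypothetical names; the
sketch only shows how the stored strings might be turned into an argv-style
array, and does not call into LLVM.

```cpp
#include <string>
#include <vector>

// Hypothetical slice of the invocation object: raw strings forwarded to the
// LLVM command line parser without interpretation.
struct BackendArgsSketch {
  std::vector<std::string> LLVMArgs; // e.g. "--relocation-model=pic"
};

// Build an argv-style array: a placeholder program name followed by the
// stored strings. The returned pointers refer into Inv, so Inv must outlive
// the result and must not be modified while the result is in use.
std::vector<const char *> buildBackendArgv(const BackendArgsSketch &Inv) {
  std::vector<const char *> Argv;
  Argv.push_back("clang (LLVM option parsing)"); // placeholder argv[0]
  for (const std::string &A : Inv.LLVMArgs)
    Argv.push_back(A.c_str());
  // In a real build this array would be handed to
  // llvm::cl::ParseCommandLineOptions(Argv.size(), Argv.data()).
  return Argv;
}
```

Because the LLVM option state is global, this path stays unsafe for
multi-threaded clients, which is why anything that matters should get a
proper API field instead.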
--
Comments?
- Daniel