[cfe-dev] Getting involved with Clang refactoring

Manuel Klimek klimek at google.com
Thu May 24 04:29:38 PDT 2012

On Thu, May 24, 2012 at 10:08 AM, David Röthlisberger <david at rothlis.net> wrote:

> On 22 May 2012, at 15:17, Douglas Gregor wrote:
> > Bringing it back to 'make' a little bit... we could, conceivably, have a
> > compilation database implicitly generated from the makefiles. If one asked
> > it how to build 'foo.cpp', it would find the appropriate make rule and form
> > the command-line arguments. We don't have such a 'live' compilation
> > database right now, but it fits into the model and would be really, really
> > cool because it would allow us to 'just work' on a makefile-based project.
> > Unfortunately, it amounts to re-implementing 'make' :(
> >
> > There are other ways we could build compilation databases. There's CMake
> > support for dumping out a compilation database; we could also add a
> > -fcompilation-database=<blah> flag that creates a compilation database as
> > the result of a build, which would work with any build system. That would
> > also be a nice little project that would help the tooling effort.
>
> For the sake of readers who, like me, don't know all the background
> information, here's what I've unearthed over the last hour or two:
> 1. If you define CMAKE_EXPORT_COMPILE_COMMANDS cmake will create the file
>   compile_commands.json.
>   See http://cmake.org/gitweb?p=cmake.git;a=commitdiff;h=fe07b055
>   and http://cmake.org/gitweb?p=cmake.git;a=commitdiff;h=5674844d
>   I don't know if the format of this json file is documented anywhere, but
>   from the above commits it seems to be an array of dicts like this:
>      { "directory": "abc", "command": "g++ -xyz ...", "file": "source.cxx" }
> 2. Clang has a tool called scan-build that wraps an invocation of make.
>   You call it like this:
>      scan-build make
>   Scan-build intercepts the compiler by setting CXX to some script that
>   forwards on to the real compiler, and then (while it still knows all
>   the compiler flags necessary to compile this file) it invokes the
>   clang static analyzer.
>   See http://clang-analyzer.llvm.org/scan-build.html
>   and http://llvm.org/svn/llvm-project/cfe/trunk/tools/scan-build/scan-build
>   It's 1400 lines of perl, but most of that seems to be command-line options,
>   usage help, and generating html reports. The compiler-interception part
>   doesn't seem too difficult.
>   Scan-build is relevant to this discussion because one could generate a
>   compilation database using a similar interposing technique.
> 3. Something completely different: Maybe we could figure out the compilation
>   command-lines for all of a project's files at once by looking at the output
>   of "make --always-make --dry-run".
>   One difference from the lets-interpose-CXX approach is that this will give
>   us some command-lines that are not C++ compilations, and we'd have to filter
>   those out.
>   Once we do know that it's a C++ compilation command-line, we still have to
>   parse that command-line to figure out the name of the sourcefile (just like
>   the interposed CXX script has to).
> 4. Doug's suggestion: Call clang with "-fcompilation-database=foo" during the
>   course of a normal build. This will simultaneously compile the file and
>   add/update an entry in the compilation database. (Or maybe only do the
>   compilation database entry, requiring a separate invocation to do the
>   actual compilation?)
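
A rough sketch of the command-line handling that approaches 2 and 3 above have in common: deciding whether a command is a C/C++ compilation and picking out the source file. Everything named here is an assumption made for illustration (find_source, entries_from_dry_run, the COMPILERS and SOURCE_EXTS sets), as is the heuristic that the first non-option argument with a C/C++ extension is the source file. It does not handle recursive make or directory changes during the build, so treat it as a starting point only.

```python
import shlex

# Assumed sets; real projects may use other compiler names or extensions.
COMPILERS = {"cc", "gcc", "g++", "c++", "clang", "clang++"}
SOURCE_EXTS = (".c", ".cc", ".cpp", ".cxx")


def find_source(argv):
    """Return the first argument that looks like a C/C++ source file, or None."""
    skip_next = False
    for arg in argv[1:]:
        if skip_next:
            skip_next = False
            continue
        if arg == "-o":            # the next argument is the output file
            skip_next = True
            continue
        if not arg.startswith("-") and arg.endswith(SOURCE_EXTS):
            return arg
    return None


def entries_from_dry_run(output, directory):
    """Turn the text printed by a make dry run into database entries."""
    entries = []
    for line in output.splitlines():
        try:
            argv = shlex.split(line)
        except ValueError:         # unbalanced quotes: not a simple command
            continue
        if not argv or argv[0].rsplit("/", 1)[-1] not in COMPILERS:
            continue               # e.g. ar, mkdir, echo
        source = find_source(argv)
        if source is None:
            continue               # e.g. a link step with only object files
        entries.append(
            {"directory": directory, "command": line.strip(), "file": source}
        )
    return entries
```

The text passed as output would come from something like "make --always-make --dry-run", captured by the caller. An interposed CXX script could reuse find_source on its own argument list before forwarding to the real compiler.
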
> Pros and cons of the various approaches:
> Cmake +  The compilation database is generated at "cmake" time -- we don't need
>         to do a full build.
> Cmake +  Works on Windows.
> Cmake -  (Obviously) doesn't work with non-cmake build systems.
> CXX interposing +  Probably the easiest to implement if you have a project that
>                   needs this *now* and you don't want to wait for a better
>                   solution to make its way into clang.
> CXX interposing +  Works with any build system as long as it is compliant with
>                   the CXX / CC environment variable convention.
> CXX interposing -  The interposed script has to parse the compilation
>                   command-line to extract the source filename. This is
>                   duplication of effort because clang already has to parse
>                   the command-line.
> CXX interposing -  Each entry to the compilation database is added as the
>                   corresponding target is being built, so in
>                   parallel/distributed builds it will have to lock the
>                   compilation database.
> make --dry-run +  Works with any make-based system (I'm not very familiar with
>                  non-GNU versions of make, but presumably they have similar
>                  flags), except for recursive-make systems as mentioned below.
> make --dry-run +  Far easier than re-implementing make.
> make --dry-run +  No need to actually build the targets.
> make --dry-run -  Like the CXX interposing technique, has to parse the
>                  compilation command-line.
> make --dry-run -  Gives you *all* the build commands, not just C or C++
>                  compilations; you'll have to filter the output for what
>                  you're interested in. Smells a bit hacky and brittle but
>                  maybe that's just my prejudices speaking.
> make --dry-run -  Doesn't work with some complex recursive-make build systems.
>                  For example if part of your makefile creates another makefile
>                  and then uses that, clearly your dry-run won't work unless it
>                  actually does create that second makefile. In theory make has
>                  ways to make this work -- see
>                  http://www.gnu.org/software/make/manual/html_node/MAKE-Variable.html
>                  -- but in practice I've never seen a large build system where
>                  dry-run works.
> clang -fcompilation-database +  Easier for the *user* than the two previous
>                                shell-script-based solutions. No mucking about
>                                with shell scripts: just set CXXFLAGS, run
>                                make, and you're done.
> clang -fcompilation-database +  Will work on Windows.
> clang -fcompilation-database -  Like the CXX interposing technique, has to lock
>                                the compilation database for parallel/
>                                distributed builds.
> clang -fcompilation-database -  Can't generate the compilation database without
>                                building your whole project with clang.
> That last point is more important (to me) than you might think. Say I have a
> large codebase and not all of it builds with clang; but for the source files
> that *can* be parsed by clang, I want to run some clang-based tool. Still,
> having "-fcompilation-database" in clang doesn't stop me from writing my own
> CXX-interposing scripts if I should need them.
> Well, that's all. I hope someone finds it useful -- I can't be the only one to
> have wondered how to actually get the full command-line through to clang-based
> tools. :-) Once we decide on an official solution let's make sure we document
> it well.

Hi Dave,

Thanks for writing all of this down!

I don't think an "official" solution for generating the compilation
database is important, as long as:
1. the format is clear
2. we support a wide range of use cases
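
To make the "format is clear" point concrete, here is a small Python sketch of a consumer for the format described in the quoted message: a JSON array of dicts with "directory", "command" and "file" keys. The function names load_database and commands_for are made up for this example, and resolving a relative "file" against "directory" is an assumption, since the thread does not spell that out.

```python
import json
import os

REQUIRED_KEYS = {"directory", "command", "file"}


def load_database(path):
    """Load a compile_commands.json-style file and check each entry's keys."""
    with open(path, encoding="utf-8") as f:
        entries = json.load(f)
    if not isinstance(entries, list):
        raise ValueError("expected a JSON array of entries")
    for entry in entries:
        if not isinstance(entry, dict):
            raise ValueError("each entry must be a JSON object")
        missing = REQUIRED_KEYS - set(entry)
        if missing:
            raise ValueError(f"entry is missing keys: {sorted(missing)}")
    return entries


def commands_for(entries, source):
    """Return the recorded commands for one source file.

    Assumption: a relative "file" value is resolved against "directory".
    """
    wanted = os.path.normpath(os.path.abspath(source))
    found = []
    for entry in entries:
        candidate = os.path.normpath(
            os.path.join(entry["directory"], entry["file"])
        )
        if candidate == wanted:
            found.append(entry["command"])
    return found
```

A tool would call load_database once and then commands_for for each file it wants to parse. Returning a list leaves room for a file that is compiled more than once with different flags.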

This is open source :) People can generally implement any of the above
solutions. Some of them might not need to live inside clang's repository,
but it would be good to have at least one solution that is as generic as
possible living inside clang, without the need for third-party tools (like
cmake or ninja). I think the switch is the best choice for that, as it's
the only one that does not add build-time dependencies for clang users.
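
Whichever producer writes entries during a build (an interposed script or a compiler switch), the quoted list notes that parallel builds need to lock the database. One possible way to do that is sketched below in Python. The name record_entry and the choice to rewrite the whole JSON file on each update are assumptions for illustration. It uses fcntl.flock, which is POSIX-only, so this particular sketch would not cover the Windows case.

```python
import fcntl
import json
import os


def record_entry(db_path, entry):
    """Add or replace the entry for one source file under an exclusive lock."""
    # Open for read/write, creating the file if needed, without truncating it.
    fd = os.open(db_path, os.O_RDWR | os.O_CREAT, 0o644)
    with os.fdopen(fd, "r+", encoding="utf-8") as f:
        fcntl.flock(f, fcntl.LOCK_EX)      # blocks until other writers finish
        text = f.read()
        entries = json.loads(text) if text.strip() else []
        key = (entry["directory"], entry["file"])
        # Drop any older entry for the same file, then append the new one.
        entries = [e for e in entries if (e["directory"], e["file"]) != key]
        entries.append(entry)
        f.seek(0)
        f.truncate()
        json.dump(entries, f, indent=2)
        f.flush()
        # The lock is released when the file is closed at the end of the block.
```

Rewriting the whole file keeps it valid JSON after every update, at the cost of serializing all writers. For a distributed build across machines a file lock like this would likely not be enough.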
