[cfe-dev] Getting involved with Clang refactoring

Arnaud de Grandmaison arnaud.allarddegrandmaison at parrot.com
Thu May 24 04:52:58 PDT 2012

On 05/24/2012 01:29 PM, Manuel Klimek wrote:
> On Thu, May 24, 2012 at 10:08 AM, David Röthlisberger
> <david at rothlis.net> wrote:
>     On 22 May 2012, at 15:17, Douglas Gregor wrote:
>     > Bringing it back to 'make' a little bit... we could, conceivably,
>     > have a compilation database implicitly generated from the
>     > makefiles. If one asked it how to build 'foo.cpp', it would find
>     > the appropriate make rule and form the command-line arguments. We
>     > don't have such a 'live' compilation database right now, but it
>     > fits into the model and would be really, really cool because it
>     > would allow us to 'just work' on a makefile-based project.
>     > Unfortunately, it amounts to re-implementing 'make' :(
>     >
>     > There are other ways we could build compilation databases. There's
>     > CMake support for dumping out a compilation database; we could
>     > also add a -fcompilation-database=<blah> flag that creates a
>     > compilation database as the result of a build, which would work
>     > with any build system. That would also be a nice little project
>     > that would help the tooling effort.
>
>     For the sake of readers who, like me, don't know all the background
>     information, here's what I've unearthed over the last hour or two:
>     1. If you define CMAKE_EXPORT_COMPILE_COMMANDS cmake will create the
>       file compile_commands.json.
>       See http://cmake.org/gitweb?p=cmake.git;a=commitdiff;h=fe07b055
>       and http://cmake.org/gitweb?p=cmake.git;a=commitdiff;h=5674844d
>       I don't know if the format of this json file is documented anywhere,
>       but from the above commits it seems to be an array of dicts like this:
>          { "directory": "abc", "command": "g++ -xyz ...", "file": "source.cxx" }
>     2. Clang has a tool called scan-build that wraps an invocation of make.
>       You call it like this:
>          scan-build make
>       Scan-build intercepts the compiler by setting CXX to some script that
>       forwards on to the real compiler, and then (while it still knows all
>       the compiler flags necessary to compile this file) it invokes the
>       clang static analyzer.
>       See http://clang-analyzer.llvm.org/scan-build.html
>       and http://llvm.org/svn/llvm-project/cfe/trunk/tools/scan-build/scan-build
>       It's 1400 lines of perl, but most of that seems to be command-line
>       options, usage help, and generating html reports. The
>       compiler-interception part doesn't seem too difficult.
>       Scan-build is relevant to this discussion because one could generate a
>       compilation database using a similar interposing technique.
>     3. Something completely different: Maybe we could figure out the
>       compilation command-lines for all of a project's files at once by
>       looking at the output of "make --always-make --dry-run".
>       One difference from the lets-interpose-CXX approach is that this will
>       give us some command-lines that are not C++ compilations, and we'd
>       have to filter those out.
>       Once we do know that it's a C++ compilation command-line, we still
>       have to parse that command-line to figure out the name of the
>       sourcefile (just like the interposed CXX script has to).
>     4. Doug's suggestion: Call clang with "-fcompilation-database=foo"
>       during the course of a normal build. This will simultaneously compile
>       the file and add/update an entry in the compilation database. (Or
>       maybe only do the compilation database entry, requiring a separate
>       invocation to do the actual compilation?)
>     Pros and cons of the various approaches:
>     Cmake +  The compilation database is generated at "cmake" time -- we
>             don't need to do a full build.
>     Cmake +  Works on Windows.
>     Cmake -  (Obviously) doesn't work with non-cmake build systems.
>     CXX interposing +  Probably the easiest to implement if you have a
>                       project that needs this *now* and you don't want
>                       to wait for a better solution to make its way
>                       into clang.
>     CXX interposing +  Works with any build system as long as it is
>                       compliant with the CXX / CC environment variable
>                       convention.
>     CXX interposing -  The interposed script has to parse the compilation
>                       command-line to extract the source filename. This
>                       is duplication of effort because clang already has
>                       to parse the command-line.
>     CXX interposing -  Each entry to the compilation database is added as
>                       the corresponding target is being built, so in
>                       parallel/distributed builds it will have to lock
>                       the compilation database.
>     make --dry-run +  Works with any make-based system (I'm not very
>                      familiar with non-GNU versions of make, but
>                      presumably they have similar flags), except for
>                      recursive-make systems as mentioned below.
>     make --dry-run +  Far easier than re-implementing make.
>     make --dry-run +  No need to actually build the targets.
>     make --dry-run -  Like the CXX interposing technique, has to parse the
>                      compilation command-line.
>     make --dry-run -  Gives you *all* the compilation commands, not just C
>                      or C++ compilations; you'll have to filter the output
>                      for what you're interested in. Smells a bit hacky and
>                      brittle but maybe that's just my prejudices speaking.
>     make --dry-run -  Doesn't work with some complex recursive-make build
>                      systems. For example if part of your makefile creates
>                      another makefile and then uses that, clearly your
>                      dry-run won't work unless it actually does create
>                      that second makefile. In theory make has ways to make
>                      this work -- see
>                      http://www.gnu.org/software/make/manual/html_node/MAKE-Variable.html
>                      -- but in practice I've never seen a large build
>                      system where dry-run works.
>     clang -fcompilation-database +  Easier for the *user* than the two
>                                    previous shell-script-based solutions.
>                                    No mucking about with shell scripts:
>                                    just set CXXFLAGS, run make, and
>                                    you're done.
>     clang -fcompilation-database +  Will work on Windows.
>     clang -fcompilation-database -  Like the CXX interposing technique, has
>                                    to lock the compilation database for
>                                    parallel/distributed builds.
>     clang -fcompilation-database -  Can't generate the compilation database
>                                    without building your whole project
>                                    with clang.
>     That last point is more important (to me) than you might think. Say I
>     have a large codebase and not all of it builds with clang; but for the
>     source files that *can* be parsed by clang, I want to run some
>     clang-based tool. Still, having "-fcompilation-database" in clang
>     doesn't stop me from writing my own CXX-interposing scripts if I should
>     need them.
>     Well, that's all. I hope someone finds it useful -- I can't be the only
>     one to have wondered how to actually get the full command-line through
>     to clang-based tools. :-) Once we decide on an official solution let's
>     make sure we document it well.
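
For what it's worth, the filtering step of approach 3 could look roughly
like the sketch below. It is only an illustration under my own assumptions:
the function name, the set of compiler names and the list of source suffixes
are all hypothetical, and it does not follow the directory changes made by
recursive make, so every entry gets the same directory. It takes the text
printed by "make --always-make --dry-run" and keeps only the lines that look
like single-file C or C++ compilations, in the entry format described in
point 1.

```python
import shlex

# Assumptions for illustration: which program names count as compilers and
# which suffixes count as C/C++ sources. Adjust both for a real project.
COMPILERS = {"cc", "gcc", "g++", "c++", "clang", "clang++"}
SOURCE_SUFFIXES = (".c", ".cc", ".cpp", ".cxx")


def entries_from_dry_run(output, directory):
    """Turn the text printed by a make dry run into compilation database entries."""
    entries = []
    for line in output.splitlines():
        try:
            argv = shlex.split(line)
        except ValueError:
            continue  # unbalanced quotes: not a simple command line
        if not argv or argv[0].rsplit("/", 1)[-1] not in COMPILERS:
            continue  # not a compiler invocation (ar, rm, echo, ...)
        sources = [arg for arg in argv[1:] if arg.endswith(SOURCE_SUFFIXES)]
        if "-c" not in argv or len(sources) != 1:
            continue  # link step, or something this sketch cannot classify
        entries.append(
            {"directory": directory, "command": line.strip(), "file": sources[0]}
        )
    return entries
```

Fed the three lines "g++ -O2 -c src/foo.cpp -o foo.o", "ar rcs libfoo.a foo.o"
and "g++ foo.o -o app", it keeps only the first one. Serializing the result
with json.dump would give a file in the shape cmake writes.
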
> Hi Dave,
> thanks for writing all the stuff down!
> I don't think that an "official" solution for how to generate the
> compile database is important, as long as
> 1. the format is clear
> 2. we support a wide range of use cases
> This is open source :) People can generally implement all of the above
> solutions. Some of them might not need to live inside clang's
> repository; it would generally be good to have at least one solution
> that is as generic as possible living inside clang without the need
> for 3rd party things (like cmake or ninja). I think for that solution
> the switch is the best one, as it's the only one that does not
> increase the dependency needs of clang users at build time.
> Thoughts?
> /Manuel

Hi Manuel & Dave,

Although the switch makes for a self-contained solution, it is not
generic enough to cover an important use case: people may not be using
clang to compile their code, but still want all the clang goodies (code
completion, ...) through an external tool. This is, for example, the
case when using clang_complete with vim: you are not forced to compile
your project with clang.
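
To make this concrete, here is a rough sketch of the CXX-interposing idea
that keeps working when the project is built with gcc. Everything named in
it is my own assumption for illustration, not an existing convention: the
REAL_CXX and COMPDB_DIR environment variables, the default output directory
and the function names. Each compile writes its entry to a separate file, so
a parallel build does not need to lock a shared database; the files are
merged into one array afterwards. It needs Python 3.8 or later for
shlex.join.

```python
import json
import os
import shlex
import tempfile

# Assumption for illustration: suffixes treated as C/C++ sources.
SOURCE_SUFFIXES = (".c", ".cc", ".cpp", ".cxx")


def make_entry(directory, compiler, args):
    """Return a database entry for a single-source compile, or None otherwise."""
    sources = [arg for arg in args if arg.endswith(SOURCE_SUFFIXES)]
    if "-c" not in args or len(sources) != 1:
        return None  # link step or multi-file command: nothing to record
    command = shlex.join([compiler] + list(args))  # quotes arguments with spaces
    return {"directory": directory, "command": command, "file": sources[0]}


def record(entry, out_dir):
    """Write the entry to its own file so parallel compiles never share a file."""
    os.makedirs(out_dir, exist_ok=True)
    fd, path = tempfile.mkstemp(suffix=".json", dir=out_dir)
    with os.fdopen(fd, "w") as handle:
        json.dump(entry, handle)
    return path


def merge(out_dir):
    """Combine the per-compile files into the single array of entries."""
    entries = []
    for name in sorted(os.listdir(out_dir)):
        with open(os.path.join(out_dir, name)) as handle:
            entries.append(json.load(handle))
    return entries


def main(argv):
    """Record the command line, then hand over to the real compiler."""
    compiler = os.environ.get("REAL_CXX", "g++")  # hypothetical variable
    entry = make_entry(os.getcwd(), compiler, argv[1:])
    if entry is not None:
        record(entry, os.environ.get("COMPDB_DIR", "/tmp/compdb"))  # hypothetical
    os.execvp(compiler, [compiler] + argv[1:])  # replaces this process
```

To try it, one would add a last line calling main(sys.argv) (plus
"import sys"), make the script executable and run make with CXX pointing at
it. Once the build finishes, dumping the result of merge() with json.dump
gives a single file for the tools to read.
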


Arnaud de Grandmaison
