<div class="gmail_quote">On Thu, May 24, 2012 at 10:08 AM, David Röthlisberger <span dir="ltr"><<a href="mailto:david@rothlis.net" target="_blank">david@rothlis.net</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div class="im">On 22 May 2012, at 15:17, Douglas Gregor wrote:<br>
> Bringing it back to 'make' a little bit... we could, conceivably, have a compilation database implicitly generated from the makefiles. If one asked it how to build 'foo.cpp', it would find the appropriate make rule and form the command-line arguments. We don't have such a 'live' compilation database right now, but it fits into the model and would be really, really cool because it would allow us to 'just work' on a makefile-based project. Unfortunately, it amounts to re-implementing 'make' :(<br>
><br>
> There are other ways we could build compilation databases. There's CMake support for dumping out a compilation database; we could also add a -fcompilation-database=<blah> flag that creates a compilation database as the result of a build, which would work with any build system. That would also be a nice little project that would help the tooling effort.<br>
<br>
<br>
<br>
</div>For the sake of readers who, like me, don't know all the background<br>
information, here's what I've unearthed over the last hour or two:<br>
<br>
1. If you set CMAKE_EXPORT_COMPILE_COMMANDS to ON, cmake will create the file<br>
compile_commands.json in the build directory.<br>
<br>
See <a href="http://cmake.org/gitweb?p=cmake.git;a=commitdiff;h=fe07b055" target="_blank">http://cmake.org/gitweb?p=cmake.git;a=commitdiff;h=fe07b055</a><br>
and <a href="http://cmake.org/gitweb?p=cmake.git;a=commitdiff;h=5674844d" target="_blank">http://cmake.org/gitweb?p=cmake.git;a=commitdiff;h=5674844d</a><br>
<br>
I don't know if the format of this JSON file is documented anywhere, but<br>
from the above commits it seems to be an array of dicts like this:<br>
<br>
{ "directory": "abc", "command": "g++ -xyz ...", "file": "source.cxx" }<br>
<br>
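As a minimal sketch of how a tool might consume such a file, assuming the format really is a JSON array of dicts with the three keys shown above, the following Python looks up the command line recorded for one source file. The function names load_commands and command_for are made up for illustration.<br>

```python
import json


def load_commands(path):
    """Return a dict mapping each entry's "file" value to the full entry."""
    with open(path) as f:
        entries = json.load(f)  # assumed: the top level is a list of dicts
    return {entry["file"]: entry for entry in entries}


def command_for(db, source):
    """Return the recorded command line for `source`, or None if absent."""
    entry = db.get(source)
    return entry["command"] if entry else None
```

A caller would do something like db = load_commands("compile_commands.json") and then command_for(db, "source.cxx"). Keying on the bare "file" string is a simplification: a real lookup would probably have to normalise paths against "directory".<br>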
<br>
2. Clang has a tool called scan-build that wraps an invocation of make.<br>
You call it like this:<br>
<br>
scan-build make<br>
<br>
Scan-build intercepts the compiler by setting CXX to some script that<br>
forwards on to the real compiler, and then (while it still knows all<br>
the compiler flags necessary to compile this file) it invokes the<br>
clang static analyzer.<br>
<br>
See <a href="http://clang-analyzer.llvm.org/scan-build.html" target="_blank">http://clang-analyzer.llvm.org/scan-build.html</a><br>
and <a href="http://llvm.org/svn/llvm-project/cfe/trunk/tools/scan-build/scan-build" target="_blank">http://llvm.org/svn/llvm-project/cfe/trunk/tools/scan-build/scan-build</a><br>
<br>
It's 1400 lines of Perl, but most of that seems to be command-line options,<br>
usage help, and generating HTML reports. The compiler-interception part<br>
doesn't seem too difficult.<br>
<br>
Scan-build is relevant to this discussion because one could generate a<br>
compilation database using a similar interposing technique.<br>
<br>
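A rough sketch of such an interposing script, written in Python rather than Perl and not taken from scan-build itself: it guesses the source file from the arguments, appends one entry to a log, and forwards to the real compiler. The environment variables REAL_CXX and COMPILE_LOG, the suffix list, and all function names are assumptions made for illustration.<br>

```python
import json
import os
import shlex
import subprocess

# Assumption: these suffixes identify the source file on the command line.
SOURCE_SUFFIXES = (".c", ".cc", ".cpp", ".cxx")


def find_source(args):
    """Return the first non-option argument that looks like a source file."""
    for arg in args:
        if not arg.startswith("-") and arg.endswith(SOURCE_SUFFIXES):
            return arg
    return None


def make_entry(compiler, args, directory):
    """Build one database entry, or None if no source file was found."""
    source = find_source(args)
    if source is None:
        return None  # e.g. a link-only invocation
    return {
        "directory": directory,
        "command": shlex.join([compiler] + args),
        "file": source,
    }


def main(argv):
    # REAL_CXX and COMPILE_LOG are hypothetical environment variables.
    compiler = os.environ.get("REAL_CXX", "g++")
    log_path = os.environ.get("COMPILE_LOG", "compile_log.jsonl")
    args = argv[1:]
    entry = make_entry(compiler, args, os.getcwd())
    if entry is not None:
        # One JSON object per line; no locking is attempted here.
        with open(log_path, "a") as log:
            log.write(json.dumps(entry) + "\n")
    # Forward to the real compiler and hand back its exit status.
    return subprocess.call([compiler] + args)
```

Saved as an executable script that ends with sys.exit(main(sys.argv)), this would be passed to the build as CXX. The argument scan is deliberately naive; it is exactly the duplicated command-line parsing that a real interposer would have to get right.<br>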
<br>
3. Something completely different: Maybe we could figure out the compilation<br>
command-lines for all of a project's files at once by looking at the output<br>
of "make --always-make --dry-run".<br>
<br>
One difference from the let's-interpose-CXX approach is that this will give<br>
us some command-lines that are not C++ compilations, and we'd have to filter<br>
those out.<br>
<br>
Once we do know that it's a C++ compilation command-line, we still have to<br>
parse that command-line to figure out the name of the sourcefile (just like<br>
the interposed CXX script has to).<br>
<br>
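The filtering step could be sketched like this, assuming the dry-run output has one complete command per line. The set of compiler names, the suffix list, and the "-c" test for telling compilations from link steps are all guesses made for illustration; real output with line continuations, shell constructs, or recursive-make directory changes would need more work.<br>

```python
import os
import shlex

# Assumptions: which program names count as a C++ compiler, and which
# suffixes mark a C++ source file.
COMPILERS = {"c++", "g++", "clang++"}
SOURCE_SUFFIXES = (".cc", ".cpp", ".cxx")


def cxx_compilations(dry_run_output):
    """Yield (source, command) pairs for lines that look like C++ compiles."""
    for line in dry_run_output.splitlines():
        try:
            words = shlex.split(line)
        except ValueError:
            continue  # unbalanced quotes: not a simple command line
        if not words or os.path.basename(words[0]) not in COMPILERS:
            continue  # some other command (mkdir, echo, ...)
        if "-c" not in words:
            continue  # probably a link step
        for word in words[1:]:
            if not word.startswith("-") and word.endswith(SOURCE_SUFFIXES):
                yield word, line.strip()
                break
```

The input would be the captured text of "make --always-make --dry-run".<br>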
<br>
4. Doug's suggestion: Call clang with "-fcompilation-database=foo" during the<br>
course of a normal build. This will simultaneously compile the file and<br>
add/update an entry in the compilation database. (Or maybe only do the<br>
compilation database entry, requiring a separate invocation to do the<br>
actual compilation?)<br>
<br>
<br>
Pros and cons of the various approaches:<br>
<br>
CMake + The compilation database is generated at "cmake" time -- we don't need<br>
to do a full build.<br>
<br>
CMake + Works on Windows.<br>
<br>
CMake - (Obviously) doesn't work with non-CMake build systems.<br>
<br>
CXX interposing + Probably the easiest to implement if you have a project that<br>
needs this *now* and you don't want to wait for a better<br>
solution to make its way into clang.<br>
<br>
CXX interposing + Works with any build system as long as it is compliant with<br>
the CXX / CC environment variable convention.<br>
<br>
CXX interposing - The interposed script has to parse the compilation command-<br>
line to extract the source filename. This is duplication of<br>
effort because clang already has to parse the command-line.<br>
<br>
CXX interposing - Each entry to the compilation database is added as the<br>
corresponding target is being built, so in<br>
parallel/distributed builds it will have to lock the<br>
compilation database.<br>
<br>
make --dry-run + Works with any make-based system (I'm not very familiar with<br>
non-GNU versions of make, but presumably they have similar<br>
flags), except for recursive-make systems as mentioned below.<br>
<br>
make --dry-run + Far easier than re-implementing make.<br>
<br>
make --dry-run + No need to actually build the targets.<br>
<br>
make --dry-run - Like the CXX interposing technique, has to parse the<br>
compilation command-line.<br>
<br>
make --dry-run - Gives you *all* the compilation commands, not just C or C++<br>
compilations; you'll have to filter the output for what<br>
you're interested in. Smells a bit hacky and brittle, but<br>
maybe that's just my prejudice speaking.<br>
<br>
make --dry-run - Doesn't work with some complex recursive-make build systems.<br>
For example, if part of your makefile creates another makefile<br>
and then uses that, clearly your dry-run won't work unless it<br>
actually does create that second makefile. In theory make has<br>
ways to make this work -- see<br>
<a href="http://www.gnu.org/software/make/manual/html_node/MAKE-Variable.html" target="_blank">http://www.gnu.org/software/make/manual/html_node/MAKE-Variable.html</a><br>
-- but in practice I've never seen a large build system where<br>
dry-run works.<br>
<br>
clang -fcompilation-database + Easier for the *user* than the two previous<br>
shell-script-based solutions. No mucking about<br>
with shell scripts: just set CXXFLAGS, run<br>
make, and you're done.<br>
<br>
clang -fcompilation-database + Will work on Windows.<br>
<br>
clang -fcompilation-database - Like the CXX interposing technique, has to lock<br>
the compilation database for parallel/<br>
distributed builds.<br>
<br>
clang -fcompilation-database - Can't generate the compilation database without<br>
building your whole project with clang.<br>
<br>
That last point is more important (to me) than you might think. Say I have a<br>
large codebase and not all of it builds with clang; but for the source files<br>
that *can* be parsed by clang, I want to run some clang-based tool. Still,<br>
having "-fcompilation-database" in clang doesn't stop me from writing my own<br>
CXX-interposing scripts if I should need them.<br>
<br>
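On the locking concern raised above for parallel builds, one possible approach on a single Unix machine is an advisory lock around each append, sketched below. The function name append_entry and the one-JSON-object-per-line file layout are assumptions; a final step would still have to turn the lines into a JSON array, and flock is not something to rely on for distributed builds over a network filesystem.<br>

```python
import fcntl  # Unix only
import json


def append_entry(db_path, entry):
    """Append one entry as a JSON line while holding an exclusive lock."""
    with open(db_path, "a") as db:
        fcntl.flock(db, fcntl.LOCK_EX)  # blocks until other writers are done
        try:
            db.write(json.dumps(entry) + "\n")
            db.flush()  # make sure the line is written before unlocking
        finally:
            fcntl.flock(db, fcntl.LOCK_UN)
```
<br>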
Well, that's all. I hope someone finds it useful -- I can't be the only one to<br>
have wondered how to actually get the full command-line through to clang-based<br>
tools. :-) Once we decide on an official solution let's make sure we document<br>
it well.<br></blockquote><div><br></div><div>Hi Dave,</div><div><br></div><div>Thanks for writing all of this down!</div><div><br></div><div>I don't think that an "official" solution for how to generate the compilation database is important, as long as:</div>
<div>1. the format is clear</div><div>2. we support a wide range of use cases</div><div><br></div><div>This is open source :) People can generally implement any of the above solutions, and some of them need not live inside clang's repository. Still, it would be good to have at least one solution, as generic as possible, living inside clang without the need for third-party tools (like cmake or ninja). For that solution I think the -fcompilation-database switch is the best choice, as it's the only one that does not add build-time dependencies for clang users.</div>
<div><br></div><div>Thoughts?</div><div>/Manuel</div></div>