[cfe-dev] Producing compilation databases (was Re: Clang-based indexer and code navigator)

David Blaikie dblaikie at gmail.com
Thu Mar 14 14:05:51 PDT 2013

On Thu, Mar 14, 2013 at 1:58 PM, Sean Silva <silvas at purdue.edu> wrote:
> On Thu, Mar 14, 2013 at 6:49 AM, David Röthlisberger <david at rothlis.net>
> wrote:
>> On 14 Mar 2013, at 10:10, Laszlo Nagy wrote:
>> >
>> > You might have seen my tool, which tries to address the compilation
>> > database problem (in case you missed it:
>> > <https://github.com/rizsotto/Bear>). It uses LD_PRELOAD to catch the
>> > compiler calls. Now I am wondering what "feels yucky" means. What
>> > other, more technical, objections do you have against it? ;) I tested it
>> > against scons, GNU make, qmake, cmake, and bash, and it works reliably in
>> > most cases. On Solaris/BSD systems you could use DTrace, which also
>> > captures all exec calls, more easily. But that's another
>> > platform-specific solution.
>> >
>> > My conclusion at the time was that I could either write an OS-specific
>> > solution that works with any build system, or a build-tool-specific
>> > solution that works on every OS. Since I'm interested in sources that are
>> > compiled on Linux, I went for the LD_PRELOAD trick.
>>
>> Ryan Prichard's "sw-btrace" is similar to "bear" but supports OS X &
>> FreeBSD as well as Linux. "bear" is already mentioned in
>> http://clang.llvm.org/docs/JSONCompilationDatabase.html -- we could also
>> add a mention of "sw-btrace".
>>
>> > I got the feeling that putting this kind of code into Clang would not
>> > solve the problem at all, but would make the Clang driver itself more
>> > complex... You still need to teach your build system to use Clang. And
>> > since you were able to do that, you can write a fake compiler, which only
>> > emits a message about its command-line arguments and generates a fake
>> > object file. (Of course you need to write fake ar/ld commands as well.)
>> > But more importantly, you need a process which collects these messages
>> > and formats them into a JSON file. (By the way, this is exactly what the
>> > LD_PRELOAD solution is doing, except there is no need for a fake
>> > compiler/linker, and no need to put code into Clang.)
>>
>> Maybe not add this to the Clang binary itself, but add a "bear" /
>> "sw-btrace" tool to the clang repository? I think it would be nice to
>> have such tools available directly from the clang project, instead of
>> having each clang-based tool invent its own or depend on yet another
>> project.
>>
>> One benefit of this approach is that it gives the clang project the
>> flexibility to change the compilation database format without the fear
>> of breaking all these other tools. (There's still CMake, though.)
> The problem is that there are no mature and established tools for generating
> the database. As the tools improve, the problem will solve itself; there's
> no reason to bring them into clang really since the major problem is just
> developing mature tools in the first place. Having a "standardized" format
> for the compilation database decouples this development from clang itself.
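
For concreteness, the fake-compiler idea quoted above (record each
invocation's command line, then have a collector format the records as
JSON) might look roughly like the sketch below. The function names and the
list of source-file suffixes are hypothetical, chosen only for
illustration; the "directory", "command" and "file" keys are the ones
described in the JSON Compilation Database documentation linked above.
This is not how Bear or sw-btrace are implemented.

```python
import json
import shlex

# Assumption: these suffixes are enough to recognise source files in argv.
SOURCE_SUFFIXES = (".c", ".cc", ".cpp", ".cxx", ".m", ".mm")


def make_entries(argv, cwd):
    """Build one compilation database entry per source file named in argv."""
    # The "command" field is a single shell-escaped string.
    command = " ".join(shlex.quote(arg) for arg in argv)
    return [
        {"directory": cwd, "command": command, "file": arg}
        for arg in argv[1:]
        if arg.endswith(SOURCE_SUFFIXES)
    ]


def collect(invocations):
    """Format recorded (argv, cwd) pairs as compilation database JSON text."""
    entries = []
    for argv, cwd in invocations:
        entries.extend(make_entries(argv, cwd))
    return json.dumps(entries, indent=2)
```

A wrapper script standing in for the compiler could call
make_entries(sys.argv, os.getcwd()) and hand the result to whatever process
gathers the records. Invocations that name no source file (a link step, for
example) simply contribute no entries.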

I really don't understand the hesitation to build compilation database
generation into Clang.

Pretty much any Clang tool you want to run on your codebase is going
to require your codebase is Clang-clean to begin with & likely the
only way you're going to get there is by integrating Clang with your
build (OK, so you could do syntax-only stuff with Clang tools in which
case you'd never have to build your project with Clang - just parse it
- but that seems like the far less common use case for Clang & its
tools). With that in mind, why wouldn't we generate a compilation
database from Clang itself? Why would we ask users to use a specific
build system to generate a file so they can use Clang tools? That
seems like a bizarre user experience.

Obviously for distributed builds that gets a bit tricky/unrealistic -
and the ability for complex build systems to generate this file seems
not inappropriate, but I'm not sure why it's being suggested that it's
the necessary/common/expected scenario.

(Even if the first pass of such support were non-parallelizable, most
build systems can be forced to run serially to avoid any filesystem
races. Then, if/when someone has the time, they can build the necessary
OS locking, etc., to do it safely concurrently.)
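
As a rough illustration of the locking mentioned above, here is a minimal
sketch of how concurrent compiler processes could each append an entry to a
shared database file. It assumes a POSIX system (it uses flock through
Python's fcntl module), and append_entry is a hypothetical helper, not
anything Clang provides.

```python
import fcntl
import json
import os


def append_entry(db_path, entry):
    """Append one entry to the JSON array in db_path under an exclusive lock."""
    # Open for reading and writing, creating the file if it does not exist.
    fd = os.open(db_path, os.O_RDWR | os.O_CREAT, 0o644)
    with os.fdopen(fd, "r+") as f:
        # Block until no other process holds the lock on this file.
        fcntl.flock(f, fcntl.LOCK_EX)
        try:
            text = f.read()
            entries = json.loads(text) if text.strip() else []
            entries.append(entry)
            # Rewrite the whole file while still holding the lock.
            f.seek(0)
            f.truncate()
            json.dump(entries, f, indent=2)
            f.flush()
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)
```

Rewriting the whole file on every compile is simple but slow for large
projects; it is only meant to show that the read-modify-write step has to
happen inside the lock.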
