[cfe-dev] [RFC] Embedding compilation database info in object files.

Sean Silva silvas at purdue.edu
Wed Jul 17 21:36:44 PDT 2013


tl;dr: compiler embeds compilation db info in object files; you can then
collect it afterwards with simple tools. Good idea?

For a while now, we have been looking for a way for clang to help users
create JSON compilation databases, but the solutions so far seem limited to
specific build systems or platforms. I came up with a neat little hack that
may be a viable way for clang to help create compilation databases
"everywhere clang runs", with a fairly good user experience.

I believe the following user experience is achievable "everywhere clang
runs":
1. Add some option to the compiler command line.
2. Rebuild.
3. Feed all of your built object files/executables to a small tool we ship
and out comes a compilation database.

The basic idea is that instead of generating the compilation database
"before compilation, by something that knows about the build a priori",
have the compilation database info come out "the back of the compiler a
posteriori" and follow the natural flow of information through the build
pipeline, and eventually be recovered from build products.

From an operational standpoint, it just involves adding a small amount of
extra logic to the compiler and doing some simple postprocessing on build
products. Hence I think this may be a good fit for our situation as clang
developers: we want to provide a feature to users, and although 1) we don't
control users' build systems/platforms, 2) we do control the compiler and
can ship small utilities alongside it.

I hacked up a minimal demo at
<https://github.com/chisophugis/clang-compdb-in-object-file>, which currently
uses a compiler wrapper to add the extra logic to the compiler. The
high-level summary of how it works is that in each TU it embeds a string
literal containing `{"directory":...,"command":...,"file":...}` in a
special section `.clang.compdb`, and then these are aggregated by the
linker; afterwards, these compilation database entries can be extracted and
put together into the final JSON compilation database. For full details of
how it does that, consult the README (and/or the source); it's pretty
hackish.
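
To make the two halves of the idea concrete, here is a rough Python sketch
of the embed step and the collect step. It is not the code from the demo
repository. The helper names (make_entry, embed_snippet, collect_entries)
are made up for illustration, the use of a section attribute on a C string
is just one plausible way to get the literal into `.clang.compdb`, and the
collector assumes the section contents have already been dumped to raw
bytes by some other tool and consist of NUL-terminated JSON strings.

```python
import json
import shlex

SECTION = ".clang.compdb"  # section name used in the description above


def make_entry(directory, argv, source_file):
    """Build one compilation database entry for a single TU (hypothetical helper)."""
    return {
        "directory": directory,
        "command": shlex.join(argv),  # shell-quote each argument
        "file": source_file,
    }


def embed_snippet(entry):
    """Return C source that stores the entry as a string in SECTION.

    Assumption: a GCC/Clang-style section attribute is available; the
    real demo may inject the string differently.
    """
    text = json.dumps(entry, sort_keys=True)
    # json.dumps emits printable ASCII here, so quoting it a second time
    # only escapes double quotes and backslashes, which is also valid in
    # a C string literal.
    c_literal = json.dumps(text)
    return (
        "static const char clang_compdb_entry[] "
        f'__attribute__((section("{SECTION}"), used)) = {c_literal};\n'
    )


def collect_entries(section_bytes):
    """Split raw section contents into compilation database entries.

    Assumption: the linker concatenated the per-TU strings, each one
    NUL-terminated, possibly with extra NUL padding in between.
    """
    entries = []
    for chunk in section_bytes.split(b"\0"):
        if chunk:  # skip terminators and padding
            entries.append(json.loads(chunk.decode("utf-8")))
    return entries
```

A wrapper would append the output of embed_snippet() to each TU before
handing it to the real compiler, and the collection tool would call
collect_entries() on each build product's section contents and write the
combined list out with json.dump() as compile_commands.json.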

As a test of the approach, I used this same essential technique to
successfully produce a compilation database from a large game (>1M lines)
without having to worry about the build system in any way (it's some sort
of Visual Studio project with a custom toolchain; I don't understand it
much beyond the GUI options it presents and which compiler binary it
invokes).

What I have now is mostly just a couple of hackish scripts; I have no idea
what final form would be most appropriate as a user-facing feature inside
of clang. Does this seem like a good direction for helping users create
compilation databases?

-- Sean Silva