<div dir="ltr"><div>tl;dr: the compiler embeds compilation database info in object files; you can then collect it afterwards with simple tools. Good idea?</div><div><br></div><div>For a while now, we have been looking for a way for clang to help users create JSON compilation databases, but the solutions so far seem limited to specific build systems or platforms. I came up with a neat little hack that may be a viable way for clang to help create compilation databases "everywhere clang runs", with a fairly good user experience.</div>
<div><br></div><div><div>I believe the following user experience is achievable "everywhere clang runs":</div><div>1. Add some option to the compiler command line.</div><div>2. Rebuild.</div><div>3. Feed all of your built object files/executables to a small tool we ship and out comes a compilation database.</div>
</div><div><br></div><div>The basic idea is that instead of generating the compilation database "before compilation, by something that knows about the build a priori", have the compilation database info come out "the back of the compiler a posteriori" and follow the natural flow of information through the build pipeline, and eventually be recovered from build products.</div>
<div><br></div><div>From an operational standpoint, it just involves adding a small amount of extra logic to the compiler and doing some simple postprocessing on build products. Hence I think this may be a good fit for our situation as clang developers: we want to provide a feature to users, but 1) don't control users' build systems/platforms but 2) do control the compiler and can ship small utilities alongside it.</div>
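<div><br></div><div>To give a feel for how simple the postprocessing could be, here is a rough Python sketch. It is only an illustration under assumptions of mine: that each entry sits verbatim in the build product as a NUL-terminated JSON object string starting with <code>{"directory":</code>, and that scanning raw bytes is good enough (a real tool would more likely parse the object format). The names <code>extract_entries</code> and <code>build_database</code> are made up for this sketch.</div>
<pre>
```python
import json
import sys

MARKER = b'{"directory":'  # assumed start of every embedded entry
KEYS = ("directory", "command", "file")


def extract_entries(blob):
    """Yield entry dicts found in the raw bytes of one build product."""
    pos = blob.find(MARKER)
    while pos != -1:
        end = blob.find(b"\0", pos)  # C string literals are NUL-terminated
        if end == -1:
            end = len(blob)
        try:
            entry = json.loads(blob[pos:end].decode("utf-8"))
        except ValueError:
            entry = None  # marker-like bytes that are not a real entry
        if isinstance(entry, dict) and all(
            isinstance(entry.get(key), str) for key in KEYS
        ):
            yield entry
        pos = blob.find(MARKER, pos + 1)


def build_database(blobs):
    """Merge entries from many build products, dropping exact duplicates."""
    seen = set()
    database = []
    for blob in blobs:
        for entry in extract_entries(blob):
            key = tuple(entry[name] for name in KEYS)
            if key not in seen:
                seen.add(key)
                database.append(entry)
    return database


if __name__ == "__main__":
    # Hypothetical usage: python collect.py a.o b.o app, redirected to
    # compile_commands.json
    blobs = []
    for path in sys.argv[1:]:
        with open(path, "rb") as handle:
            blobs.append(handle.read())
    json.dump(build_database(blobs), sys.stdout, indent=2)
```
</pre>
<div>Deduplicating matters because the same entry can show up both in an object file and in the executable it was linked into.</div>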
<div><br></div><div>I hacked up a minimal demo at &lt;<a href="https://github.com/chisophugis/clang-compdb-in-object-file">https://github.com/chisophugis/clang-compdb-in-object-file</a>&gt;, which currently uses a compiler wrapper to add the extra logic to the compiler. The high-level summary of how it works: in each TU it embeds a string literal containing `{"directory":...,"command":...,"file":...}` in a special section `.clang.compdb`, and the linker then aggregates these sections. Afterwards, the compilation database entries can be extracted from the build products and put together into the final JSON compilation database. For full details of how it does that, consult the README (and/or the source); it's pretty hackish.</div>
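<div><br></div><div>For illustration, the embedding side could look roughly like the Python sketch below. This is not the code from the repository; the helper names, the symbol name <code>clang_compdb_entry</code>, the use of <code>shlex.join</code> for the "command" field, and the GNU-style <code>section</code> attribute are all assumptions of mine, and section naming rules differ between object formats. It builds the JSON entry for one TU and produces C source that places it, as a string literal, in `.clang.compdb`.</div>
<pre>
```python
import json
import shlex

SECTION = ".clang.compdb"  # section name taken from the post


def make_entry(directory, argv, source_file):
    """Return one compilation database entry as a compact JSON string."""
    entry = {
        "directory": directory,
        "command": shlex.join(argv),  # shell-quoted form of the argv list
        "file": source_file,
    }
    return json.dumps(entry, separators=(",", ":"))


def c_string_literal(text):
    """Quote text as a C string literal, octal-escaping anything risky."""
    out = []
    for byte in text.encode("utf-8"):
        # Keep printable ASCII, except quote, backslash and '?' (trigraphs).
        if byte in range(32, 127) and byte not in b'"\\?':
            out.append(chr(byte))
        else:
            out.append("\\%03o" % byte)
    return '"' + "".join(out) + '"'


def section_snippet(entry_json, symbol="clang_compdb_entry"):
    """Return C source that places entry_json in the SECTION section.

    Hypothetical: the symbol name and the GNU-style attribute are
    illustrative only.
    """
    return (
        '__attribute__((section("%s"), used))\n'
        "static const char %s[] = %s;\n"
    ) % (SECTION, symbol, c_string_literal(entry_json))


if __name__ == "__main__":
    # Made-up directory and command line, purely for demonstration.
    entry = make_entry("/src/game", ["clang", "-c", "-O2", "main.c"], "main.c")
    print(section_snippet(entry))
```
</pre>
<div>The NUL terminator of each string literal conveniently separates the entries once the linker has concatenated the sections.</div>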
<div><br></div><div>As a test of the approach, I used this same essential technique to successfully produce a compilation database from a large game (&gt;1M lines) without having to worry about the build system in any way (it's some sort of Visual Studio project with a custom toolchain; I don't understand much of it beyond the GUI options it presents and which compiler binary it invokes).</div>
<div><br></div><div>What I have now is mostly just a couple of hackish scripts; I have no idea what final form would be most appropriate as a user-facing feature inside of clang. Does this seem like a good direction for helping users create compilation databases?</div>
<div><br></div><div>-- Sean Silva</div>
</div>