[cfe-dev] [RFC] Embedding compilation database info in object files.

Joshua Cranmer pidgeot18 at gmail.com
Mon Jul 22 14:27:27 PDT 2013


On 7/22/2013 3:26 PM, Sean Silva wrote:
> In dealing with game teams, each one may use a different (possibly 
> custom/private) build system/mashup of build systems, many of which 
> are closed source/proprietary (e.g. MSBuild.exe from Visual Studio). 
> I'm trying to come up with a solution that will work independently of 
> the build system, or at least with as few assumptions as possible 
> (things like "they have access to their final build products, since 
> otherwise how would they run them" and "they can modify the compiler 
> flags"). Like I said in the OP, I was able to rapidly extract a 
> compilation database from a completely unfamiliar (closed-source, 
> proprietary) build system (that I still don't understand!).

The implicit assumptions for your approach amount to the following:
1. The user can make their build system use clang.
2. The user can make their build system add compiler flags to clang.
3. The user can find all of the final build products.
4. The build system does not mutilate binaries for the final build 
products in a way that would render the embedded information 
unrecoverable, or if this is false, the build system retains an 
intermediate copy of the products that has not yet been mutilated, and 
these intermediate copies can be checked.
5. The binary format of the targets is capable of carrying this 
information, and the information can be easily extracted from it.
6. Adding this extra information would not cause the build system to fail.
7. The user is willing to add all of this extra information to their 
final build products, or to apply a post-processing step to extract all 
of this extra information.
8. The set of all build steps may be found in the union of all final 
build products.

Number 3 can be less trivial than it seems, particularly if you don't 
think to add de-duplication steps. Number 4 is definitely not 
universally true (I've used some build systems which mutilate the final 
product into a custom binary format)--and may be generally false in the 
embedded world. Number 5 I think may be iffy, and I can think of 
situations that would make number 6 not true. Number 7 hampers 
usability in some deployments by imposing steps that would not 
otherwise be necessary.
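
To make the de-duplication point under number 3 concrete, here is a 
minimal sketch in Python. It assumes the entries have already been pulled 
out of each build product by some means not shown here, and that they use 
the "directory", "file" and "command" fields of the JSON compilation 
database format. The function name merge_entries is made up for 
illustration.

```python
def merge_entries(per_product_entries):
    """Merge entries recovered from several build products.

    per_product_entries is a list of lists of dicts, one inner list per
    build product. Exact duplicates (same directory, file and command)
    are dropped; first-seen order is preserved.
    """
    seen = set()
    merged = []
    for entries in per_product_entries:
        for entry in entries:
            key = (entry["directory"], entry["file"], entry["command"])
            if key not in seen:
                seen.add(key)
                merged.append(entry)
    return merged
```

An object file that is linked into both a static library and an 
executable would show up twice in the input and once in the result. The 
same source compiled twice with different flags is kept as two entries, 
since the commands differ.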

In contrast, the other proposal of "append to a JSON file" requires the 
following implicit assumptions:
1. The user can make their build system use clang.
2. The user can make their build system add compiler flags to clang.
3. There exists a single, well-known file location that can be accessed 
by every build step.
4. There exists a way to lock that file and have only one process 
atomically update it at a time.
5. The build system will execute every build step.

Note that #3 is already implicitly assumed by compilation databases; 
when it doesn't hold, a post-processing step would need to be applied. 
This post-processing step isn't necessarily trivial. Consider that the 
main Mozilla codebase has approximately 49 build configurations [1]; 
just assuming that the working directory needs to be tweaked for the 
compiler might let me compile as many as 21 of them [2]. I could 
plausibly compile all 49 of them with excessive copies of headers (I'd 
need base headers for Windows, 32-bit and 64-bit Linux, and OS X), but I 
don't think anyone would advocate that clang should support this 
situation out of the box.

Assumption #4 is a solved problem, even on broken NFS systems: years of 
experience with mailbox locking have produced the answers. The last 
assumption is probably almost always false, but if people are willing to 
make basic assumptions about how accurate a compilation database will be 
(i.e., I'm not going to easily get one that lets me compile for multiple 
platforms), it will tend to hold true to a first approximation.
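
As an illustration of assumption #4, here is a rough sketch in Python of 
the classic lock-file technique used for mailboxes: create a uniquely 
named file, hard-link it to the lock name, and treat a link count of 2 as 
proof that the lock was taken, which avoids relying on O_EXCL working 
over NFS. The function name append_entry, the ".lock" and ".tmp" suffixes 
and the timeout are assumptions made for this example, not anything clang 
provides. Stale locks left by a crashed process are not handled, and the 
unique name assumes one caller per process.

```python
import json
import os
import socket
import time


def append_entry(db_path, entry, timeout=10.0):
    """Append one entry to the JSON array in db_path under a lock file."""
    lock_path = db_path + ".lock"
    # Unique per host and process, on the same filesystem as the lock.
    unique = "%s.%s.%d" % (lock_path, socket.gethostname(), os.getpid())
    os.close(os.open(unique, os.O_CREAT | os.O_WRONLY, 0o644))
    deadline = time.monotonic() + timeout
    try:
        while True:
            try:
                os.link(unique, lock_path)
                break  # link succeeded: we hold the lock
            except OSError:
                # Over NFS the reply to link() can be lost, so the link
                # count of our own file is the authoritative answer.
                if os.stat(unique).st_nlink == 2:
                    break
                if time.monotonic() > deadline:
                    raise TimeoutError("could not lock " + db_path)
                time.sleep(0.05)
        try:
            entries = []
            if os.path.exists(db_path):
                with open(db_path) as f:
                    entries = json.load(f)
            entries.append(entry)
            # Write a complete new file and rename it into place so a
            # reader never sees a half-written database.
            tmp = db_path + ".tmp"
            with open(tmp, "w") as f:
                json.dump(entries, f, indent=2)
            os.replace(tmp, db_path)
        finally:
            os.unlink(lock_path)  # release the lock
    finally:
        os.unlink(unique)
```

Each compiler invocation would call something like this once with its 
own entry. Rewriting the whole file on every append is quadratic in the 
number of build steps, which is a trade-off of keeping the file valid 
JSON at all times.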

[1] By which I mean "we have a continuous integration system that runs 
49 build configurations and expects them to keep working." The actual 
number of build configurations for which we would accept a patch if one 
broke probably exceeds 100.
[2] Actually, it would let me compile 0 of them, since I wouldn't have a 
copy of the build-generated source files.

-- 
Joshua Cranmer
News submodule owner
DXR coauthor



