[cfe-dev] [RFC] Embedding compilation database info in object files.

Fri Jul 19 16:10:40 PDT 2013

On Thu, Jul 18, 2013 at 11:39 PM, Manuel Klimek <klimek at google.com> wrote:

> On Thu, Jul 18, 2013 at 11:20 PM, Sean Silva <silvas at purdue.edu> wrote:
>
>>
>> On Thu, Jul 18, 2013 at 5:08 AM, Manuel Klimek <klimek at google.com> wrote:
>>
>>> On Thu, Jul 18, 2013 at 12:28 PM, Sean Silva <silvas at purdue.edu> wrote:
>>>
>>>> On Wed, Jul 17, 2013 at 9:44 PM, Manuel Klimek <klimek at google.com> wrote:
>>>>>
>>>>> We have done similar things before internally, but considered it more
>>>>> to be a hack ;)
>>>>>
>>>>> I think the direction that we want to go to is to have an option in
>>>>> clang to append to a compilation database while running - that way, no
>>>>> post-processing step is required, which again needs to be somehow put into
>>>>> the build flow. The only part missing is somebody with enough time on their
>>>>> hands for whom this is high enough priority.
>>>>>
>>>>
>>>> Wouldn't this approach (appending to a compilation database) have
>>>> issues with filesystem contention and/or write atomicity in
>>>> multicore/distributed builds (without involving a "real database" for the
>>>> database storage)?
>>>>
>>>
>>> On Unix systems we can handle that via file locks. On Windows we'd need
>>> a Windows expert :P
>>>
>>>
>>>> Also, wouldn't a post-processing step be needed in order to remove
>>>> outdated entries appended from a previous incremental build (consider:
>>>> `make; <rename some file in the project>; make`)?
>>>>
>>>
>>> Well, we could require a rebuild to update the database (basically rm
>>> the compilation database, make clean && rebuild).
>>>
>>>
>>>> The approach I proposed has two extremely desirable properties that I
>>>> think would be hard to achieve with an approach that carries the
>>>> information in an external "side channel", as in the approach you suggested:
>>>> 1. The compilation database info is always up to date as long as the
>>>> build products are up to date, since the information follows the "causal
>>>> chain" leading to the final programs/libraries.
>>>>
>>>
>>> Wouldn't it have exactly the same "delete" problem? When I rename a .cc
>>> file, won't most build systems leave the .o just lying around?
>>>
>>
>> The use case I primarily envision is sourcing the compdb info in the
>> usual case from "final" build products, like executables and libraries. In
>> that case, the old .o would not be linked into the final build product and
>> hence its compilation database info would not be included; there would be
>> issues if one of the final build products is renamed though, but I think
>> that is relatively rare, and we can document this particular caveat. In
>> other cases (even when sourcing .o's), I think a useful, actionable
>> diagnostic can be emitted ("compilation database entry found in file foo.o
>> doesn't seem to correspond to any source file; skip it? delete it?").
>>
>
> Normally a project has multiple "final build products". The reason we have
> the compilation database is that given a source file, you want to be able
> to parse it. If I give you a source file, how do you know which of the
> final build products you look into to get the information? All of them?
> Have yet another database?
>
>

See my response to Chandler (the one that talks about "getting information
from A to B") which hopefully clarifies where I'm coming from (writing it
certainly clarified things for me!). Simply embedding this information in
(existing) build products is not necessarily an "ultimate solution";
however, it is a very widely applicable simplification of the problem
"given a C/C++ project, produce a compilation database for it".
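
To make the "all of them?" case concrete, here is a rough sketch of a merge
step over several final build products. It assumes the per-product entries
have already been extracted somehow (the extraction itself is not shown and
is not specified anywhere in this thread), and that each entry uses the
"directory"/"command"/"file" keys of the JSON compilation database format.
The names merge_compdb and source_exists are made up for illustration.

```python
import os


def merge_compdb(entries_per_product, source_exists=os.path.exists):
    """Merge entries taken from several build products into one database.

    entries_per_product maps a product name (hypothetical, e.g. "app") to a
    list of dicts with "directory", "command" and "file" keys.
    Returns (merged_entries, stale_entries).
    """
    merged = {}
    stale = []
    for product, entries in entries_per_product.items():
        for entry in entries:
            # A relative "file" is resolved against the entry's directory;
            # an absolute "file" is left unchanged by os.path.join.
            path = os.path.join(entry["directory"], entry["file"])
            if not source_exists(path):
                # The source file is gone (e.g. renamed): report it
                # instead of silently keeping the entry.
                stale.append((product, entry))
                continue
            # The same object linked into two products yields one entry.
            merged.setdefault((path, entry["command"]), entry)
    return list(merged.values()), stale
```

A caller could write the merged list out as compile_commands.json and turn
each stale pair into the kind of "skip it? delete it?" diagnostic mentioned
above.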

-- Sean Silva