[cfe-dev] JSONCompilationDatabase additional keys

Fri Sep 27 13:57:25 PDT 2013

(resending without some of the attachments, since the list doesn't like big
messages; ask if you want me to send the full Mathematica notebook that I
used for the analysis)

On Fri, Sep 27, 2013 at 2:59 AM, Manuel Klimek <klimek at google.com> wrote:

> On Sat, Sep 14, 2013 at 8:31 AM, Sean Silva <silvas at purdue.edu> wrote:
>
>>
>> On Fri, Sep 13, 2013 at 5:59 AM, Manuel Klimek <klimek at google.com> wrote:
>>
>>> On Thu, Sep 12, 2013 at 3:07 AM, Sean Silva <silvas at purdue.edu> wrote:
>>>
>>>>
>>>> On Wed, Sep 11, 2013 at 8:14 AM, Manuel Klimek <klimek at google.com> wrote:
>>>>
>>>>> On Wed, Sep 11, 2013 at 12:41 PM, Nicholas Gill <mythagel at gmail.com> wrote:
>>>>>
>>>>>> Hello cfe-dev,
>>>>>>
>>>>>> At present unknown keys in the compile_commands.json file will cause
>>>>>> the JSONCompilationDatabase parser to reject the file.
>>>>>>
>>>>>> 1. Would a patch to relax this constraint be accepted?
>>>>>> 2. Would other consumers of the compile_commands.json be negatively
>>>>>> impacted by unknown keys?
>>>>>>
>>>>>
>>>>> Yes. This can be rather big, and being able to quickly parse it is
>>>>> important for interactive use cases.
>>>>>
>>>>>
>>>>
>>>> I don't think that this is a very good reason. Parsing the compilation
>>>> database in the JSON format is going to take O(project size) work, and
>>>> anything O(project size) is not going to be adequate for interactive use
>>>> cases anyway.
>>>>
>>>
>>> How do you come to that conclusion? I've run benchmarks on chromium-sized
>>> projects, and the interactive use-case worked just fine.
>>>
>>>
>>
>> On my machine it takes 30ms to parse the compile_commands.json for
>> clang/llvm (measured with perf(1)). The typical expectation for
>> interactive response time is 100ms, so 30ms is about 1/3 of the total
>> time available, which IMO is unacceptable. By the same token, a project
>> >3x larger would be noninteractive.
>>
>
> Have you made sure it scales linearly?
> If your analysis is true, it has regressed, and we should fix the YAML
> parser.
>

Yes, I'm almost 100% sure that it scales linearly; see the attached plot
and raw data. The timings are given in seconds. The sizes are given
relative to the original LLVM R+A compile_commands.json: 0.5 means half
the entries removed, and 2 means double the number of entries (I just
copied existing entries).
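
For anyone who wants to reproduce the scaled databases, here is a minimal
sketch of one way to generate them. The function names are hypothetical,
and the choice of which entries to drop (a prefix is kept) and how copies
are ordered (cyclic repetition) is an assumption; the only thing taken
from the description above is "remove entries" for sizes below 1 and
"copy existing entries" for sizes above 1.

```python
import json


def scale_entries(entries, factor):
    """Return a list with about factor * len(entries) entries.

    Factors below 1 keep a prefix of the list (assumption: which entries
    are dropped is not specified). Factors above 1 repeat the existing
    entries cyclically, i.e. they only copy entries that already exist.
    """
    if factor < 0:
        raise ValueError("factor must be non-negative")
    if not entries:
        return []
    target = int(round(len(entries) * factor))
    return [entries[i % len(entries)] for i in range(target)]


def write_scaled_database(src_path, dst_path, factor):
    """Read a compile_commands.json (a JSON array of command objects)
    and write a scaled copy to dst_path."""
    with open(src_path) as f:
        entries = json.load(f)
    with open(dst_path, "w") as f:
        json.dump(scale_entries(entries, factor), f, indent=2)
```

Each scaled file would then go into its own build directory under the
name compile_commands.json, so that the -p option can pick it up.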

The timing was done with:
    perf stat -e cpu-clock clang-check -p ../r+a/ lib/IR/IRBuilder.cpp
For the duration of the analysis, I made IRBuilder.cpp an empty file.
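
To make the "scales linearly" claim checkable from (size, seconds) pairs,
here is a rough sketch of an ordinary least-squares fit. It is not the
analysis from the Mathematica notebook; the function names are
hypothetical, and reading the 100ms figure as a budget to solve against
is just one way to use the fit.

```python
def linear_fit(sizes, seconds):
    """Ordinary least-squares fit: seconds ~= slope * size + intercept.

    Returns (slope, intercept, r_squared). An r_squared close to 1 is
    consistent with linear scaling over the measured range.
    """
    n = len(sizes)
    if n < 2 or n != len(seconds):
        raise ValueError("need at least two (size, seconds) pairs")
    mean_x = sum(sizes) / n
    mean_y = sum(seconds) / n
    sxx = sum((x - mean_x) ** 2 for x in sizes)
    if sxx == 0:
        raise ValueError("all sizes are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(sizes, seconds))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (slope * x + intercept)) ** 2
                 for x, y in zip(sizes, seconds))
    ss_tot = sum((y - mean_y) ** 2 for y in seconds)
    r_squared = 1.0 if ss_tot == 0 else 1.0 - ss_res / ss_tot
    return slope, intercept, r_squared


def size_at_budget(slope, intercept, budget_seconds):
    """Relative database size at which the fitted time reaches the budget."""
    if slope <= 0:
        raise ValueError("slope must be positive")
    return (budget_seconds - intercept) / slope
```

Feeding in the pairs from the raw data and calling
size_at_budget(slope, intercept, 0.1) gives the relative project size at
which the fitted time alone would use up a 100ms budget.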

-- Sean Silva
-------------- next part --------------
A non-text attachment was scrubbed...
Name: CompilationDatabaseParsePerformanceAnalysis.png
Type: image/png
Size: 7239 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20130927/cf43ea0b/attachment.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: CompilationDatabaseParsePerformance.json
Type: application/json
Size: 2370 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20130927/cf43ea0b/attachment.json>