[clangd-dev] Investigating performance tracking infrastructure

Tue Aug 21 04:36:43 PDT 2018

Hi Alex,

I agree with the multiple-modes strategy. It would be great to be able to
track both sema + index and index-only performance without any additional
cost, and if the framework were generic enough, that would be even better!
My other responses are inline.

On Thu, Aug 16, 2018 at 2:04 AM Alex L <arphaman at gmail.com> wrote:

> Thanks for your responses!
>
> I realize now that I should have been more specific when it comes to
> completion latency. We're currently interested in sema completion latency,
> but the infrastructure that I would like to set up will support latency
> with the completion results obtained from the index as well.
> Essentially, for a completion test-case we would like to have the option
> to run it in two / three modes:
> - just sema completion
> - index completion or sema + index completion
> Note that we don't have to test a completion test-case in all modes, so we
> could just have a sema based completion test.
>
> This way we'll be able to identify the regressions in a particular
> component (sema vs index) in a better way. Do you think this idea works for
> you?
>
> More responses inline:
>
> On Tue, 14 Aug 2018 at 00:31, Eric Liu <ioeric at google.com> wrote:
>
>>
>>
>> On Tue, Aug 14, 2018, 08:40 Kirill Bobyrev <kbobyrev.lists at gmail.com>
>> wrote:
>>
>>> Hi Alex,
>>>
>>> Such a test-suite might be very useful and it'd be great to have it. As
>>> Eric mentioned, I am working on pulling the benchmark library into LLVM,
>>> although I fell behind over the past week due to complications with
>>> libc++ (you can follow the thread here:
>>> http://lists.llvm.org/pipermail/llvm-dev/2018-August/125176.html).
>>>
>>
> Thanks! Do you have a general idea of how you would like to use the
> benchmarking library?
>
I've looked into how the benchmark library is used in libc++ and test-suite,
but that wasn't very helpful because the usage there seems to be very
specific to those projects. I've started looking into pulling the library
into LLVM (https://reviews.llvm.org/D50894), but I have a few concerns there.

> I'm mainly interested in a more complete test that we could run using some
> sort of harness and whose results can be fed into LNT.
>
Can you please elaborate on what you mean by feeding results into LNT? Are
you thinking about setting a latency limit and failing the "benchmark
tests" as soon as the latency exceeds it, or are you interested in
building LNT targets which you can run alongside the unit tests?

>
>
>>
>>> Eric, Ilya and I have been discussing a possible "cheap" solution: a
>>> tool to which a user can feed a compilation database and which could
>>> process some queries (maybe in YAML format, too). This would allow a
>>> realistic benchmark (since you could simply feed it the LLVM codebase or
>>> anything else of the size you're aiming for) and would be relatively easy
>>> to implement. The downside of such an approach is that it would require
>>> some setup effort. As an alternative, it might be worth feeding it a YAML
>>> symbol index instead of the compilation commands, because currently the
>>> global symbol builder is not very efficient. I am looking into that
>>> issue, too; we have a few ideas about what the performance bottlenecks in
>>> global-symbol-builder might be and how to fix them, so hopefully I will
>>> make the tool much faster soon.
>>>
>> Note that sema latency is something we also need to take into
>> consideration, as it's always part of code completion flow, with or without
>> index.
>>
>
>>> In the long term, however, I think the LLVM community is also interested
>>> in benchmarking other tools which exist under the LLVM umbrella, so I think
>>> that opting for the Benchmark approach would be more beneficial. Having
>>> an infrastructure based on LNT that we could run either on some buildbots
>>> or locally would be even better. The downside is that it might turn out to
>>> be really hard to maintain a realistic test-suite: storing a YAML dump
>>> of the static index in the tree would be hard because we wouldn't want
>>> 300+ MB files there, but hosting it somewhere else and downloading it
>>> would also introduce additional complexity. On the other hand,
>>> generating a realistic index programmatically might also be hard.
>>>
>>
> I don't have a strong opinion on how the index should be stored. However,
> I think it's helpful to break this problem down into different categories,
> and look at three kinds of indexing data sets:
> - index data set that's derived from a part of the LLVM umbrella
> (llvm/clang/test-suite/whatever).
>   => One possible solution: this index can be rebuilt on every run.
>
Yes, but that unfortunately takes too long at the moment. I started looking
into that and fixed a YAML serialization performance problem (
https://reviews.llvm.org/D50839), but there are a few other bottlenecks left.

> - index data set that's derived from a project outside of the LLVM
> umbrella.
>   => One possible solution: This index can be stored as an archive of YAML
> files in one of the LLVM repos.
>
That's one way of doing it, right, but I'm not sure any LLVM repo would like
to store a 300 MB YAML file and update it over time. However, I don't know
if there are already any cases like this and whether it might be acceptable.

> - auto generated index data?
>
While this might be the most appealing option, generating a realistic
index might turn out to be hard. Still, I think it would be beneficial to
have a couple of artificial indices in the benchmarks.

> It would probably be valuable to have different kinds of index data sets.
>
Agreed, that would also mean more coverage.

Relatively "cheap" solutions which I'm thinking about are:

* Recording a user session and replaying its input to Clangd to measure
the performance. That would give us a realistic benchmark without
investing too much effort into the benchmark itself. However, it might
turn out to be hard to track the performance contributions of individual
components. Also, I'm not sure if it's generic enough.
* Creating a tool which would accept a YAML symbol dump, build an index and
take a set of requests (e.g. from another file) to measure total completion
latency. That solves most of my problems, but it is tied to the index
testing use case, which is not enough for comprehensive performance
tracking.
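
To make the second option more concrete, here is a minimal Python sketch
of the measurement loop such a tool could use. It is only an illustration:
measure_latencies, toy_lookup and SYMBOLS are hypothetical names, and the
substring lookup is a stand-in for a real index built from a YAML symbol
dump, not Clangd's actual index API.

```python
import statistics
import time
from typing import Callable, Iterable, List


def measure_latencies(run_query: Callable[[str], object],
                      queries: Iterable[str],
                      repeats: int = 5) -> dict:
    """Time each query `repeats` times and summarise the per-query medians."""
    medians: List[float] = []
    for query in queries:
        samples = []
        for _ in range(repeats):
            start = time.perf_counter()
            run_query(query)
            samples.append(time.perf_counter() - start)
        # The median is less sensitive to one-off scheduling noise than the mean.
        medians.append(statistics.median(samples))
    return {
        "queries": len(medians),
        "total_s": sum(medians),
        "worst_s": max(medians) if medians else 0.0,
    }


# Hypothetical stand-in for an index: a flat list of qualified symbol names.
SYMBOLS = ["llvm::StringRef", "clang::Sema", "clang::clangd::SymbolIndex"]


def toy_lookup(query: str) -> list:
    """Case-insensitive substring match over SYMBOLS."""
    return [s for s in SYMBOLS if query.lower() in s.lower()]


if __name__ == "__main__":
    print(measure_latencies(toy_lookup, ["sema", "index"], repeats=3))
```

In a real tool the queries would come from the requests file and run_query
would call into the index built from the symbol dump.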

Unfortunately, I don't yet have a good idea of how to build a comprehensive
performance tracking pipeline, e.g. how to continuously obtain an index for
a project such as LLVM, adjust buildbots, measure performance, ensure that
it's realistic, etc.

>
>
>>
>>> Having said that, convenient infrastructure for benchmarking which would
>>> align with the LNT and wouldn't require additional effort from the users
>>> would be amazing and we are certainly interested in collaboration. What
>>> models of the benchmarks have you considered and what do you think about
>>> the options described above?
>>>
>>
> For the sema based completion latency tracking I would like to start off
> with two simple things to get some basic infrastructure working:
> - C++ test-case: measuring sema code-completion latency (with preamble) in
> a file from a fixed revision of Clang.
> - ObjC test-case: similar to above, some ObjC code with a portion.
> One issue is that the system headers that will be used are not static,
> which leads to issues such as the baseline becoming out of date when the
> SDK on the GreenDragon bots is updated.
>
> Ideally I would use some harness based on compile commands. Each test file
> would have a compilation command entry in the database.
> I was also thinking that the test command could be fed into Clangd using
> LSP itself. Similarly to how code-completion is requested in Clangd's
> regression test, we could write a test that would send in the LSP commands
> into Clangd. Or maybe the test harness could generate them from some sort
> of test description (e.g. test completion at these locations at that file).
>
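Generating the requests from a test description could look roughly like
the Python sketch below. It relies only on the LSP base protocol framing
(a Content-Length header followed by a JSON-RPC body) and the standard
textDocument/completion request. The function names and the
(line, character) description format are hypothetical, and a real session
would also have to send initialize and textDocument/didOpen first.

```python
import json


def frame_lsp_message(payload: dict) -> bytes:
    """Wrap a JSON-RPC payload in LSP base-protocol framing."""
    body = json.dumps(payload).encode("utf-8")
    # Content-Length counts the bytes of the body, not characters.
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return header + body


def completion_requests(uri: str, positions, first_id: int = 1):
    """Yield one framed completion request per (line, character) pair.

    Positions are zero-based, as in LSP.
    """
    for offset, (line, character) in enumerate(positions):
        yield frame_lsp_message({
            "jsonrpc": "2.0",
            "id": first_id + offset,
            "method": "textDocument/completion",
            "params": {
                "textDocument": {"uri": uri},
                "position": {"line": line, "character": character},
            },
        })


if __name__ == "__main__":
    # Hypothetical test description: complete at two places in one file.
    for message in completion_requests("file:///tmp/test.cpp", [(3, 7), (10, 0)]):
        print(message.decode("utf-8"))
```

The framed messages can then be written to Clangd's standard input in order.
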
> The latency could be measured by scanning the output of the run of Clangd
> with CLANGD_TRACE.
>
> The test harness would then capture the result and upload it to LNT. A
> subsequent bot would check for big regressions (e.g. +10%) against the
> baseline (or previous result).
>
Sounds good to me!
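
For the regression check itself, a minimal Python sketch could be as
simple as the following. The test names and latency numbers are made up
for illustration, find_regressions is a hypothetical helper, and none of
this uses LNT's own API; it only shows the comparison against a baseline
with the +10% threshold you mention.

```python
def find_regressions(baseline: dict, current: dict,
                     threshold: float = 0.10) -> dict:
    """Return {test name: relative slowdown} for tests slower than threshold.

    Both inputs map a test name to a latency in the same unit. Tests missing
    from `current`, or with a non-positive baseline, are skipped.
    """
    regressions = {}
    for name, base in baseline.items():
        new = current.get(name)
        if new is None or base <= 0:
            continue
        change = (new - base) / base
        if change > threshold:
            regressions[name] = change
    return regressions


if __name__ == "__main__":
    # Made-up latencies in milliseconds.
    baseline = {"cpp-sema-completion": 100.0, "objc-sema-completion": 200.0}
    current = {"cpp-sema-completion": 115.0, "objc-sema-completion": 210.0}
    for name, change in find_regressions(baseline, current).items():
        print(f"{name}: +{change:.0%}")
```

With these numbers only the C++ test is reported, since the ObjC test is
5% slower and stays under the threshold.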

Kind regards,
Kirill Bobyrev

>
> Cheers,
> Alex
>
>
>>
>>> Kind regards,
>>> Kirill Bobyrev
>>>
>>> On Tue, Aug 14, 2018 at 7:35 AM Eric Liu via clangd-dev <
>>> clangd-dev at lists.llvm.org> wrote:
>>>
>>>> Hi Alex,
>>>>
>>>> Kirill is working on pulling the Google Benchmark library into LLVM and
>>>> adding benchmarks to clangd. We are also mostly interested in code
>>>> completion latency and index performance at this point. We don't have a
>>>> very clear idea of how to create realistic benchmarks yet, e.g. what code
>>>> to use and what static index corpus to use. I wonder if you have ideas
>>>> here.
>>>>
>>>> Another option that might be worth considering is adding a tool that
>>>> runs clangd code completion on some existing files in the llvm/clang
>>>> codebase. It can potentially measure both code completion quality and
>>>> latency.
>>>>
>>>> -Eric
>>>> On Tue, Aug 14, 2018, 00:53 Alex L via clangd-dev <
>>>> clangd-dev at lists.llvm.org> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I'm currently investigating and putting together a plan for
>>>>> open-source and internal performance tracking infrastructure for Clangd.
>>>>>
>>>>> Initially we're interested in one particular metric:
>>>>> - Code-completion latency
>>>>>
>>>>> I would like to put together infrastructure that's based on LNT and
>>>>> that would identify performance regressions that arise as new commits come
>>>>> in. From the performance issues I've observed in our libclang stack, the
>>>>> existing test-suite in LLVM does not really reproduce the performance
>>>>> issues that we see in practice well enough. In my opinion we
>>>>> should create some sort of editor performance test-suite that would be
>>>>> unrelated to the test-suite that's used for compile time and performance
>>>>> tracking. WDYT?
>>>>>
>>>>> I'm wondering if there are any other folks looking at this at the
>>>>> moment as well. If yes, I would like to figure out a way to collaborate on
>>>>> a solution that would satisfy all of our requirements. Please let me know
>>>>> if you have ideas in terms of how we should be running the tests, what
>>>>> the test-suite should be, or what your needs are.
>>>>>
>>>>> Thanks,
>>>>> Alex
>>>>> _______________________________________________
>>>>> clangd-dev mailing list
>>>>> clangd-dev at lists.llvm.org
>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/clangd-dev
>>>>>
>>>>
>>>