<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class="">Hi Alex,<div class=""><br class=""></div><div class="">Thank you for the follow-up!<br class=""><div><br class=""><blockquote type="cite" class=""><div class="">On 5 Sep 2018, at 01:53, Alex L <<a href="mailto:arphaman@gmail.com" class="">arphaman@gmail.com</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><div dir="ltr" class=""><div dir="ltr" class=""><div dir="ltr" class=""><div dir="ltr" class=""><div dir="ltr" class=""><div dir="ltr" class="">Hi,<div class=""><br class=""></div><div class="">I wrote a performance test harness for Clangd, but got delayed with the initial patch as I was out on vacation. The initial design and implementation focuses on measuring code-completion (sema-based right now) for a static project with a fixed set of sources. Before sending out this patch I would like to get some feedback related to one specific example that demonstrates how the code-completion latency can be measured and tracked for a simple project.</div><div class=""><br class=""></div><div class="">Let's say we'd like to measure the code-completion latency in 'main.cpp' at line 5 column 1. This design allows you to write an LSP test that's configured by CMake that is then executed appropriately by lit using a new test format. 
Here's an example of a test file that would be understood and tested by lit and the new test harness:</div><div class=""><br class=""></div><div class=""> {</div><div class=""> 'compile_commands': '@CMAKE_CURRENT_BINARY_DIR@/compile_commands.json',</div><div class=""> 'interactions': [</div><div class=""> { 'lsp': { 'method': 'textDocument/didOpen', 'params': {'textDocument':{'uri': 'MAKE_URI(@CMAKE_CURRENT_BINARY_DIR@/main.cpp)','languageId':'cpp','version':1,'text': 'LOAD_FILE(@CMAKE_CURRENT_BINARY_DIR@/main.cpp)' }} } },</div><div class=""> { 'lsp': { 'method': 'textDocument/completion', 'params': {'textDocument':{'uri': 'MAKE_URI(@CMAKE_CURRENT_BINARY_DIR@/main.cpp)'},'position':{'line':5,'character':1}} } }</div><div class=""> ],</div><div class=""> 'measure': 'SEMA_COMPLETION'</div><div class="">}</div></div></div></div></div></div></div></div></blockquote>The format seems to be human-readable. I was wondering whether you expect the test cases in the performance tracking suite to be mostly written by hand, or whether you have a helper tool that extracts the pieces or actions you would like to target from recorded sessions.</div><div><blockquote type="cite" class=""><div class=""><div dir="ltr" class=""><div dir="ltr" class=""><div dir="ltr" class=""><div dir="ltr" class=""><div dir="ltr" class=""><div dir="ltr" class=""><div class=""><br class=""></div><div class=""><div class="">The 'compile_commands' property would let the test harness know where the configured compilation database is.</div><div class="">The 'interactions' property contains a list of LSP/other interactions that are sent to Clangd by the test harness. The MAKE_URI and LOAD_FILE functions would be interpreted appropriately by the test harness.</div><div class="">The 'measure' property determines the key metric(s) that are being measured by the test. 
The CI job would ensure that the corresponding metrics are uploaded to LNT.</div></div></div></div></div></div></div></div></div></blockquote>Do you plan to support different measurements? What would be the other metrics you are considering?</div><div><br class=""></div><div>Having that on LNT would be amazing, I’d be happy to see that!</div><div><br class=""></div><div>One question for the LNT, though: what would be the project for which you set up the testing? Is that some specific small/large scale project or a fixed version of LLVM?</div><div><br class=""></div><div>Thanks for the feedback,</div><div>Kirill</div><div><blockquote type="cite" class=""><div class=""><div dir="ltr" class=""><div dir="ltr" class=""><div dir="ltr" class=""><div dir="ltr" class=""><div dir="ltr" class=""><div dir="ltr" class=""><div class=""><br class=""></div><div class="">Please let me know what you think,</div><div class="">Thanks</div><div class="">Alex</div></div></div></div></div></div></div><br class=""><div class="gmail_quote"><div dir="ltr" class="">On Tue, 21 Aug 2018 at 23:25, Kirill Bobyrev <<a href="mailto:kbobyrev.lists@gmail.com" class="">kbobyrev.lists@gmail.com</a>> wrote:<br class=""></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word;line-break:after-white-space" class="">I see, thank you for the clarification! 
Looking forward to the patch!<div class=""><br class=""></div><div class="">-Kirill<br class=""><div class=""><br class=""><blockquote type="cite" class=""><div class="">On 21 Aug 2018, at 22:48, Alex L <<a href="mailto:arphaman@gmail.com" target="_blank" class="">arphaman@gmail.com</a>> wrote:</div><br class="m_4849605771609097637Apple-interchange-newline"><div class=""><div dir="ltr" style="font-family:Helvetica;font-size:12px;font-style:normal;font-variant-caps:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;text-decoration:none" class=""><br class=""><br class=""><div class="gmail_quote"><div dir="ltr" class="">On Tue, 21 Aug 2018 at 04:36, Kirill Bobyrev <<a href="mailto:kbobyrev.lists@gmail.com" target="_blank" class="">kbobyrev.lists@gmail.com</a>> wrote:<br class=""></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr" class="">Hi Alex,<div class=""><br class=""></div><div class="">I agree with the multiple modes strategy; it would be great to enable tracking both sema + index and index performance without any additional cost. If the framework were generic enough, that would be even better! The other responses are inline.</div><br class=""><div class="gmail_quote"><div dir="ltr" class="">On Thu, Aug 16, 2018 at 2:04 AM Alex L <<a href="mailto:arphaman@gmail.com" target="_blank" class="">arphaman@gmail.com</a>> wrote:<br class=""></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr" class="">Thanks for your responses!<div class=""><br class=""></div><div class="">I realize now that I should have been more specific when it comes to completion latency. 
We're currently interested in sema completion latency, but the infrastructure that I would like to set up will support latency with the completion results obtained from the index as well.</div><div class="">Essentially, for a completion test-case we would like to have the option to run it in two / three modes:</div><div class="">- just sema completion</div><div class="">- index completion or sema + index completion</div><div class="">Note that we don't have to test a completion test-case in all modes, so we could just have a sema based completion test.</div><div class=""></div><div class=""><br class=""></div><div class="">This way we'll be able to identify the regressions in a particular component (sema vs index) in a better way. Do you think this idea works for you?</div><div class=""><br class=""></div><div class="">More responses inline:</div><div class=""><br class=""></div><div class=""><div class="gmail_quote"><div dir="ltr" class="">On Tue, 14 Aug 2018 at 00:31, Eric Liu <<a href="mailto:ioeric@google.com" target="_blank" class="">ioeric@google.com</a>> wrote:<br class=""></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><br class=""><br class=""><div class="gmail_quote"><div dir="ltr" class="">On Tue, Aug 14, 2018, 08:40 Kirill Bobyrev <<a href="mailto:kbobyrev.lists@gmail.com" target="_blank" class="">kbobyrev.lists@gmail.com</a>> wrote:<br class=""></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr" class="">Hi Alex,<div class=""><br class=""></div><div class="">Such test-suite might be very useful and it'd be great to have it. As Eric mentioned, I am working on pulling benchmark library into LLVM. 
I fell behind over the past week, though, due to the complications with libc++ (you can follow the thread here: <a href="http://lists.llvm.org/pipermail/llvm-dev/2018-August/125176.html" target="_blank" class="">http://lists.llvm.org/pipermail/llvm-dev/2018-August/125176.html</a>). </div></div></blockquote></div></blockquote><div class=""><br class=""></div><div class="">Thanks! Do you have a general idea of how you would like to use the benchmarking library?</div></div></div></div></blockquote><div class="">I've looked into benchmark usage in libc++ and test-suite, but they weren't very helpful because they seem to be very specific there. I've started looking into pulling the library into LLVM (<a href="https://reviews.llvm.org/D50894" target="_blank" class="">https://reviews.llvm.org/D50894</a>) but I have a few concerns there. </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr" class=""><div class=""><div class="gmail_quote"><div class="">I'm mainly interested in a more complete test that we could run using some sort of harness and whose results can be fed into LNT.</div></div></div></div></blockquote><div class="">Can you please elaborate on what you mean by feeding results into LNT? Are you thinking about controlling the latency and failing the "benchmark tests" as soon as the latency is beyond some limit, or are you interested in building LNT targets which you can run alongside the unit tests?</div></div></div></blockquote><div class=""><br class=""></div><div class="">By feeding results I mean basically uploading and storing them in the LNT database. We won't really need to run the test-suite using LNT, as the performance harness will take care of it.</div><div class=""><br class=""></div><div class="">When it comes to the CI, we won't try to fail the tests because of the latency while running them. 
We will instead run all of the tests and upload the performance results to the store. Then we will run a follow-up CI job that will compare the gathered results and check for big regressions against a baseline. </div><div class=""><br class=""></div><div class="">We should also have a way to run the tests locally that would check for regressions right after the tests are done so it would be possible to do local pre-commit testing.</div><div class=""> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr" class=""><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr" class=""><div class=""><div class="gmail_quote"><div class=""> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr" class=""><div class=""><br class=""></div><div class="">Eric, Ilya and I have been discussing a possible "cheap" solution - a tool which the user can feed a compilation database and which could process some queries (maybe in YAML format, too). This would allow a realistic benchmark (since you could simply feed the LLVM codebase or anything else with the size you're aiming for) and be relatively easy to implement. The downside of such an approach would be that it would require some setup effort. As an alternative, it might be worth feeding a YAML symbol index instead of the compilation commands, because currently the global symbol builder is not very efficient. 
I am looking into that issue, too; we have a few ideas about what the performance bottlenecks in global-symbol-builder can be and how to fix them, so hopefully I will make the tool much faster soon.</div></div></blockquote></div><div class="">Note that sema latency is something we also need to take into consideration, as it's always part of the code completion flow, with or without the index. </div></blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr" class=""><div class=""><br class=""></div><div class="">In the long term, however, I think the LLVM Community is also interested in benchmarking other tools which exist under the LLVM umbrella, so I think that opting for the Benchmark approach would be more beneficial. Having an infrastructure based on LNT that we could run either on some buildbots or locally would be even better. The downside is that it might turn out to be really hard to maintain a realistic test-suite, e.g. storing a YAML dump of the static index somewhere would be hard because we wouldn't want 300+ MB files in the tree, but hosting it somewhere else and downloading it would also potentially introduce additional complexity. On the other hand, generating a realistic index programmatically might also be hard.</div></div></blockquote></div></blockquote><div class=""><br class=""></div><div class=""><div class="">I don't have a strong opinion on how the index should be stored. However, I think it's helpful to break down this problem into different categories, and look at three kinds of indexing data sets:</div><div class="">- index data set that's derived from a part of the LLVM umbrella (llvm/clang/test-suite/whatever). 
</div><div class=""> <span class="m_4849605771609097637Apple-converted-space"> </span>=> One possible solution: this index can be rebuilt on every run.</div></div></div></div></div></blockquote><div class="">Yes, but that unfortunately takes too long at the moment. I started looking into that and fixed a YAML serialization performance problem (<a href="https://reviews.llvm.org/D50839" target="_blank" class="">https://reviews.llvm.org/D50839</a>), but there are a few other bottlenecks left.</div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr" class=""><div class=""><div class="gmail_quote"><div class=""><div class="">- index data set that's derived from a project outside of the LLVM umbrella.</div><div class=""> <span class="m_4849605771609097637Apple-converted-space"> </span>=> One possible solution: This index can be stored as an archive of YAML files in one of the LLVM repos.</div></div></div></div></div></blockquote><div class="">That's one way of doing it, right, but I'm not sure any LLVM repo would like to store a 300 MB YAML file and update it over time. However, I don't know if there are already any cases like this and whether it might be acceptable. </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr" class=""><div class=""><div class="gmail_quote"><div class=""><div class="">- auto generated index data?</div></div></div></div></div></blockquote><div class=""> While this might be the most appealing option, the generation of a realistic index might turn out to be hard. 
However, I think we should still have a couple of artificial indices in the benchmarks, as they might be beneficial.</div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr" class=""><div class=""><div class="gmail_quote"><div class=""><div class="">It would probably be valuable to have different kinds of index data sets. </div></div></div></div></div></blockquote><div class="">Agreed, that would also mean more coverage.</div><div class=""><br class=""></div><div class="">Relatively "cheap" solutions which I'm thinking about are:</div><div class=""><br class=""></div><div class="">* Recording a user session and mirroring the input file to Clangd to measure the performance. That would eliminate the complexity of creating a realistic benchmark without investing too much effort into the benchmark itself. However, it might turn out to be hard to track the performance contributions of individual components. Also, I'm not sure if it's generic enough.</div></div></div></blockquote><div class=""><br class=""></div><div class="">Recording the user session can certainly be very appealing, but I'm not sure how well it will translate into a realistic benchmark. I suppose it depends on the particular performance issue. Nevertheless, it would be really good to have this capability to help us investigate performance issues.</div><div class=""> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr" class=""><div class="gmail_quote"><div class="">* Creating a tool which would accept a YAML symbol dump, build an index and get a set of requests (e.g. from another file) to measure total completion latency. 
That solves most of my problems, but is tied to the index testing usecase which is not enough for comprehensive performance tracking.</div><div class=""><br class=""></div><div class="">I unfortunately didn't get any good idea of how to build comprehensive performance tracking pipeline yet, e.g. how to continuously get index for (e.g.) LLVM, adjust buildbots, measure performance, ensure that it's realistic, etc. </div></div></div></blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr" class=""><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr" class=""><div class=""><div class="gmail_quote"><div class=""> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr" class=""><div class=""><br class=""></div><div class="">Having said that, convenient infrastructure for benchmarking which would align with the LNT and wouldn't require additional effort from the users would be amazing and we are certainly interested in collaboration. 
What models of the benchmarks have you considered and what do you think about the options described above?</div></div></blockquote></div></blockquote><div class=""><br class=""></div><div class="">For the sema based completion latency tracking I would like to start off with two simple things to get some basic infrastructure working:</div><div class="">- C++ test-case: measuring sema code-completion latency (with preamble) in a file from a fixed revision of Clang.</div><div class="">- ObjC test-case: similar to above, some ObjC code with a portion.</div><div class="">One issue is that the system headers that will be used are not static, so the baseline might be out of date when the SDK on the GreenDragon bots is updated.</div><div class=""><br class=""></div><div class="">Ideally I would use some harness based on compile commands. Each test file would have a compilation command entry in the database.</div><div class="">I was also thinking that the test command could be fed into Clangd using LSP itself. Similarly to how code-completion is requested in Clangd's regression tests, we could write a test that would send the LSP commands to Clangd. Or maybe the test harness could generate them from some sort of test description (e.g. test completion at these locations in that file).</div><div class=""><br class=""></div><div class="">The latency could be measured by scanning the output of the run of Clangd with CLANGD_TRACE.</div><div class=""><br class=""></div><div class="">The test harness would then capture the result and upload it to LNT. A subsequent bot would check for big regressions (e.g. +10%) against the baseline (or previous result). 
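<div class=""><br class=""></div><div class="">A minimal sketch of the regression check described above, assuming per-test latencies are available as plain name-to-milliseconds mappings. The function name, the sample test names and the data layout are hypothetical and are not part of Clangd, lit or LNT; only the 10% threshold comes from the example above.</div><pre class="">
```python
def find_regressions(baseline, current, threshold=0.10):
    """Return tests whose latency grew by more than `threshold` (0.10 = +10%).

    Both arguments map a test name to a latency in milliseconds. Tests that
    have no usable baseline entry are skipped: there is nothing to compare.
    """
    regressions = {}
    for name, latency_ms in current.items():
        baseline_ms = baseline.get(name)
        if not baseline_ms:  # missing or zero baseline: no meaningful ratio
            continue
        relative_change = (latency_ms - baseline_ms) / baseline_ms
        if relative_change > threshold:
            regressions[name] = relative_change
    return regressions


# Hypothetical numbers, for illustration only.
baseline = {"sema-completion-cpp": 100.0, "sema-completion-objc": 80.0}
current = {"sema-completion-cpp": 115.0, "sema-completion-objc": 82.0}
regressions = find_regressions(baseline, current)
for name, change in sorted(regressions.items()):
    print(f"{name}: {change:+.1%} over baseline")
```
</pre><div class="">A local pre-commit run could call the same function right after the tests finish, while the CI variant would read the baseline from the stored results instead.</div>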
</div></div></div></div></blockquote><div class="">Sounds good to me!</div></div></div></blockquote><div class=""><br class=""></div><div class="">I'm hoping to put up a patch for a prototype implementation sometime this week.</div><div class=""><br class=""></div><div class="">Cheers,</div><div class="">Alex</div><div class=""> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr" class=""><div class="gmail_quote"><div class=""><br class=""></div><div class="">Kind regards,</div><div class="">Kirill Bobyrev</div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr" class=""><div class=""><div class="gmail_quote"><div class=""><br class=""></div><div class="">Cheers,</div><div class="">Alex</div><div class=""> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr" class=""><div class=""><br class=""></div><div class="">Kind regards,</div><div class="">Kirill Bobyrev</div></div><br class=""><div class="gmail_quote"><div dir="ltr" class="">On Tue, Aug 14, 2018 at 7:35 AM Eric Liu via clangd-dev <<a href="mailto:clangd-dev@lists.llvm.org" target="_blank" class="">clangd-dev@lists.llvm.org</a>> wrote:<br class=""></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div class="">Hi Alex,</div><div class=""><br class=""></div><div class="">Kirill is working on pulling google benchmark library 
into llvm and adding benchmarks to clangd. We are also mostly interested in code completion latency and index performance at this point. We don't have a very clear idea on how to create realistic benchmarks yet, e.g. what code to use, what static index corpus to use. I wonder if you have ideas here.</div><div class=""><br class=""></div><div class="">Another option that might be worth considering is adding a tool that runs clangd code completion on some existing files in the llvm/clang codebase. It can potentially measure both code completion quality and latency.</div><div class=""><br class=""></div><div class="">-Eric</div><div class=""><div class="gmail_quote"><div dir="ltr" class="">On Tue, Aug 14, 2018, 00:53 Alex L via clangd-dev <<a href="mailto:clangd-dev@lists.llvm.org" target="_blank" class="">clangd-dev@lists.llvm.org</a>> wrote:<br class=""></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr" class="">Hi,<div class=""><br class=""></div><div class="">I'm currently investigating and putting together a plan for open-source and internal performance tracking infrastructure for Clangd. </div><div class=""><br class=""></div><div class="">Initially we're interested in one particular metric:</div><div class="">- Code-completion latency</div><div class=""><br class=""></div><div class="">I would like to put together infrastructure that's based on LNT and that would identify performance regressions that arise as new commits come in. Judging from the performance issues I've observed in our libclang stack, the existing test-suite in LLVM does not reproduce the performance issues that we see in practice well enough. In my opinion we should create some sort of editor performance test-suite that would be unrelated to the test-suite that's used for compile time and performance tracking. 
WDYT?</div><div class=""><br class=""></div><div class="">I'm wondering if there are any other folks looking at this at the moment as well. If yes, I would like to figure out a way to collaborate on a solution that would satisfy all of our requirements. Please let me know if you have ideas in terms of how we should be running the tests / what the test-suite should be, or what your needs are.</div><div class=""><br class=""></div><div class="">Thanks,</div><div class="">Alex</div></div>_______________________________________________<br class="">clangd-dev mailing list<br class=""><a href="mailto:clangd-dev@lists.llvm.org" target="_blank" class="">clangd-dev@lists.llvm.org</a><br class=""><a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/clangd-dev" rel="noreferrer" target="_blank" class="">http://lists.llvm.org/cgi-bin/mailman/listinfo/clangd-dev</a><br class=""></blockquote></div></div>_______________________________________________<br class="">clangd-dev mailing list<br class=""><a href="mailto:clangd-dev@lists.llvm.org" target="_blank" class="">clangd-dev@lists.llvm.org</a><br class=""><a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/clangd-dev" rel="noreferrer" target="_blank" class="">http://lists.llvm.org/cgi-bin/mailman/listinfo/clangd-dev</a></blockquote></div></blockquote></div></blockquote></div></div></div></blockquote></div></div></blockquote></div></div></div></blockquote></div><br class=""></div></div></blockquote></div>
</div></blockquote></div><br class=""></div></body></html>