<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<style type="text/css" style="display:none;"><!-- P {margin-top:0;margin-bottom:0;} --></style>
</head>
<body dir="ltr">
<div id="divtagdefaultwrapper" style="font-size:12pt;color:#000000;font-family:Calibri,Helvetica,sans-serif;" dir="ltr">
<p style="margin-top:0;margin-bottom:0">Hi Eric,</p>
<p style="margin-top:0;margin-bottom:0"><br>
</p>
<p style="margin-top:0;margin-bottom:0">Thanks a lot for the clarifications. From the sound of it, we are heading in a similar direction.</p>
<p style="margin-top:0;margin-bottom:0"><br>
</p>
<p style="margin-top:0;margin-bottom:0">Cheers,<br>
</p>
<p style="margin-top:0;margin-bottom:0">Marc-André<br>
</p>
</div>
<hr style="display:inline-block;width:98%" tabindex="-1">
<div id="divRplyFwdMsg" dir="ltr"><font face="Calibri, sans-serif" style="font-size:11pt" color="#000000"><b>From:</b> Eric Liu &lt;ioeric@google.com&gt;<br>
<b>Sent:</b> Tuesday, July 17, 2018 5:01:06 PM<br>
<b>To:</b> Marc-André Laperle<br>
<b>Cc:</b> clangd-dev@lists.llvm.org; Kirill Bobyrev; via cfe-dev<br>
<b>Subject:</b> Re: [cfe-dev] RFC: Symbol index for Clangd design proposal</font>
<div> </div>
</div>
<meta content="text/html; charset=utf-8">
<div>
<div dir="ltr"><br>
<div class="x_gmail_quote">
<div dir="ltr">On Tue, Jul 17, 2018 at 8:11 PM Marc-André Laperle via cfe-dev &lt;<a href="mailto:cfe-dev@lists.llvm.org">cfe-dev@lists.llvm.org</a>&gt; wrote:<br>
</div>
<blockquote class="x_gmail_quote" style="margin:0 0 0 .8ex; border-left:1px #ccc solid; padding-left:1ex">
<div dir="ltr">
<div id="x_m_-4639693677737068069divtagdefaultwrapper" dir="ltr" style="font-size:12pt; color:#000000; font-family:Calibri,Helvetica,sans-serif">
<div>Hi Kirill,<br>
<br>
Thanks a lot for posting this proposal! I have a few questions that are maybe a bit more high-level. About the "static index" you mentioned: when would it be updated? If I remember correctly, for Google the static index might be a remote server, and I assume it would be updated periodically as new commits are applied to the repo? Just making sure I understand your use case. When you mentioned that the proposal would be implemented by the end of September, would that still use YAML as the static index storage?</div>
</div>
</div>
</blockquote>
<div>Note that Kirill's design is trying to address the problem of serving collected symbols efficiently (i.e. implementing SymbolIndex), not the problem of "indexing"/collecting symbols for the static index.</div>
<div><br>
</div>
<div>One of the main goals here is to make it work well for both a small dynamic index and a large static index, with no restriction on how symbols are collected. For clangd, the static index is just a "SymbolIndex" that does not change within the lifetime of a clangd instance. The symbols can come from the YAML global-symbol-builder or from index-while-build, and we don't expect the symbol index implementation to change dramatically across these scenarios. So to answer your question: yes, the short-term plan is to continue using the offline-built YAML symbol table (global-symbol-builder) as the symbol source for the static index :) The YAML format is experimental, and we would also like to move to index-while-build in the future.<br>
</div>
<br class="x_inbox-inbox-Apple-interchange-newline">
<blockquote class="x_gmail_quote" style="margin:0 0 0 .8ex; border-left:1px #ccc solid; padding-left:1ex">
<div dir="ltr">
<div id="x_m_-4639693677737068069divtagdefaultwrapper" dir="ltr" style="font-size:12pt; color:#000000; font-family:Calibri,Helvetica,sans-serif">
<div><br>
If not, what did you have in mind for a more typical usage of Clangd on a single machine? Would the static index be the unit and record files (index-while-building)?</div>
</div>
</div>
</blockquote>
<blockquote class="x_gmail_quote" style="margin:0 0 0 .8ex; border-left:1px #ccc solid; padding-left:1ex">
<div dir="ltr">
<div id="x_m_-4639693677737068069divtagdefaultwrapper" dir="ltr" style="font-size:12pt; color:#000000; font-family:Calibri,Helvetica,sans-serif">
<div>We were thinking along those lines: when a file is changed and saved, Clangd starts a background indexing task and updates the corresponding unit/record files. Unsaved files would be the dynamic index.<span style="font-family:sans-serif; font-size:small"> </span><span style="font-family:sans-serif; font-size:small"> </span></div>
</div>
</div>
</blockquote>
<div>This sounds like a good optimization that can be useful in index-while-build integration. </div>
<blockquote class="x_gmail_quote" style="margin:0 0 0 .8ex; border-left:1px #ccc solid; padding-left:1ex">
<div dir="ltr">
<div id="x_m_-4639693677737068069divtagdefaultwrapper" dir="ltr" style="font-size:12pt; color:#000000; font-family:Calibri,Helvetica,sans-serif">
<div>Now, the index of "USR to record-files" (and other global-level info) could be generated when Clangd is started by reading all unit/record files, and then kept in memory. I haven't measured how fast that would be, but judging from the presentation last fall [1], it was taking a few seconds on LLVM/Clang. I could imagine this taking a few minutes every time Clangd is started on a bigger code base. So the next step would be to persist that index to disk for a greater speed-up, using LMDB or similar.</div>
</div>
</div>
</blockquote>
<div>
<div>Another interesting problem with index-while-build is how to build a full symbol index (with both fuzzy find and USR->record lookup support) quickly for symbols from all TUs, when a new clangd instance is started. From our experience with global-symbol-builder,
merging symbols across all TUs can be very expensive (>20 mins for LLVM/Clang!). YAML may contribute to some slowness here, and I would expect bit-format files used in index-while-build to speed up serialization/deserialization. But merging symbols from TUs
can still be slow, so we might end up needing persistent storage for merged symbols and/or the symbol index. Obviously, this has to be measured when we have index-while-build.</div>
</div>
<blockquote class="x_gmail_quote" style="margin:0 0 0 .8ex; border-left:1px #ccc solid; padding-left:1ex">
<div dir="ltr">
<div id="x_m_-4639693677737068069divtagdefaultwrapper" dir="ltr" style="font-size:12pt; color:#000000; font-family:Calibri,Helvetica,sans-serif">
<div><br>
By defining the interface for the static index well, I think it should be possible to support both scenarios (local/index-while-building vs. remote).<br>
<br>
For the local scenario with background indexing, I made a rough prototype many months ago just to do basic testing of the "index while building" patches. We would like to join the effort to provide that kind of functionality in Clangd, but it is not clear how to proceed. I am thinking that, in the short term, we could help get the "index-while-building" patches reviewed and accepted. But it would be good to make sure we are heading in the same direction and to coordinate on what needs to be done.</div>
</div>
</div>
</blockquote>
<div>Using index-while-building as the source of the static index is also our long-term goal, so we would definitely like to get aligned and also help get the index-while-build patches landed. I think there will be a large design space for integrating index-while-build into clangd, and a dedicated design doc would probably be a good starting point. But I think the index-while-building effort is outside the scope of Kirill's design, which should focus on designing a performant symbol index for all scenarios.</div>
<div><br>
</div>
<div>Cheers,</div>
<div>Eric</div>
<blockquote class="x_gmail_quote" style="margin:0 0 0 .8ex; border-left:1px #ccc solid; padding-left:1ex">
<div dir="ltr">
<div id="x_m_-4639693677737068069divtagdefaultwrapper" dir="ltr" style="font-size:12pt; color:#000000; font-family:Calibri,Helvetica,sans-serif">
<div><br>
<br>
Regards,<br>
Marc-André</div>
<br>
<p style="margin-top:0; margin-bottom:0">[1] <a href="https://youtu.be/jGJhnIT-D2M?t=940" class="x_m_-4639693677737068069OWAAutoLink" id="x_m_-4639693677737068069LPlnk788392" target="_blank">https://youtu.be/jGJhnIT-D2M?t=940</a><br>
</p>
</div>
<hr style="display:inline-block; width:98%">
<div id="x_m_-4639693677737068069divRplyFwdMsg" dir="ltr"><font face="Calibri, sans-serif" color="#000000" style="font-size:11pt"><b>From:</b> cfe-dev &lt;<a href="mailto:cfe-dev-bounces@lists.llvm.org" target="_blank">cfe-dev-bounces@lists.llvm.org</a>&gt; on behalf of Kirill Bobyrev via cfe-dev &lt;<a href="mailto:cfe-dev@lists.llvm.org" target="_blank">cfe-dev@lists.llvm.org</a>&gt;<br>
<b>Sent:</b> Monday, July 16, 2018 6:03:52 AM<br>
<b>To:</b> <a href="mailto:clangd-dev@lists.llvm.org" target="_blank">clangd-dev@lists.llvm.org</a><br>
<b>Cc:</b> Peter Collingbourne via cfe-dev<br>
<b>Subject:</b> [cfe-dev] RFC: Symbol index for Clangd design proposal</font>
<div> </div>
</div>
</div>
<div dir="ltr">
<div>
<div dir="ltr">
<div>Dear LLVM Community,</div>
<div><br>
</div>
<div>Over the past few weeks, we (the Google C++ Language Tools Team) have been working on a proposal for an efficient symbol index for Clangd. The goal is to improve overall Clangd performance by reducing the latency of different kinds of symbol search queries, such as the ones used for code completion. The plan is to follow the proposed design and replace the existing implementation by the end of September.</div>
<div><br>
</div>
<div>We are happy to get feedback and comments on the proposal: suggestions are welcome!</div>
<div><br>
</div>
<div>Link to the design document: <a href="https://docs.google.com/document/d/1C-A6PGT6TynyaX4PXyExNMiGmJ2jL1UwV91Kyx11gOI/edit?usp=sharing" target="_blank">
https://docs.google.com/document/d/1C-A6PGT6TynyaX4PXyExNMiGmJ2jL1UwV91Kyx11gOI/edit?usp=sharing</a></div>
<div><br>
</div>
<div>Kind regards,</div>
<div>Kirill Bobyrev</div>
</div>
</div>
</div>
_______________________________________________<br>
cfe-dev mailing list<br>
<a href="mailto:cfe-dev@lists.llvm.org" target="_blank">cfe-dev@lists.llvm.org</a><br>
<a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev" rel="noreferrer" target="_blank">http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev</a><br>
</blockquote>
</div>
</div>
</div>
</body>
</html>