<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class=""><br class=""><div><br class=""><blockquote type="cite" class=""><div class="">On Oct 23, 2018, at 12:10 PM, Sam McCall <<a href="mailto:sammccall@google.com" class="">sammccall@google.com</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><div dir="auto" class=""><div class="gmail_quote" dir="auto"><div dir="ltr" class="">On Tue, Oct 23, 2018, 20:34 Argyrios Kyrtzidis <<a href="mailto:akyrtzi@gmail.com" class="">akyrtzi@gmail.com</a>> wrote:<br class=""></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word;line-break:after-white-space" class=""><div class=""><div class="">Since you started looking at background indexing functionality, here’s some feedback based on our experiences, which will also provide some context for why we pursued index-while-building.</div></div></div></blockquote></div><div dir="auto" class="">Yes, when looking at your designs we saw there are huge benefits to index-while-build when you can use the "real" build.</div><div dir="auto" class=""><br class=""></div><div dir="auto" class="">This relies on the build toolchain being a recent enough/version-locked clang so that it produces usable IWB output.</div><div dir="auto" class="">I think this is ~always the case for Xcode+mac and ~never for cmake+linux type projects. The system compiler is probably GCC most of the time!</div><div dir="auto" class=""><br class=""></div><div dir="auto" class="">So we figured we needed to have a solid story without relying on actual index-while-build. 
With some sadness!</div><div dir="auto" class=""><span style="font-family:sans-serif" class="">(We actually do have a build-integrated indexer internally at Google where we control the toolchain)</span></div></div></div></blockquote><div><br class=""></div><div>To clarify a bit more, we still have a ‘background indexer’, but it essentially invokes clang with ‘-fsyntax-only’ and ‘-index-store-path’ to get the index data, the same data that we would get if the file were built.</div><div>We have similar situations where a project doesn’t actually produce index data during building, e.g. CMake-generated projects or the Unreal project, in which case we fall back to getting the data from the background indexer.</div><div>That means that even if the system compiler is GCC, you could still use the mechanism by essentially doing a ‘-fsyntax-only’ “build” in the background. However, if the project is able to use clang for building, then it can naturally take advantage of generating the same data during a build.</div><br class=""><blockquote type="cite" class=""><div class=""><div dir="auto" class=""><div dir="auto" class=""><span style="font-family:sans-serif" class=""><br class=""></span></div><div class="gmail_quote" dir="auto"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word;line-break:after-white-space" class=""><div class=""><div class="">We haven’t looked into Dex in detail, but it <i class="">seems</i> that it could play the role of what we currently use LMDB for: speeding up queries to efficiently figure out where certain information resides in the record files generated by index-while-building.</div></div></div></blockquote></div><div dir="auto" class=""><br class=""></div><div dir="auto" class="">There are a couple of layers: the token/posting list/iterator stuff is fairly generic search engine machinery.</div><div dir="auto" class="">If the lookup you need is something 
like "top N items whose properties satisfy some boolean expression tree", it might fit the bill.</div><div dir="auto" class=""><br class=""></div><div dir="auto" class="">On top of that, the actual SymbolIndex implementation provides the fuzzyFind operation we use for code completion. That might be useful if it's harder to build such things on LMDB.</div><div dir="auto" class=""><br class=""></div><div dir="auto" class="">The biggest limitation is these structures aren't incremental - we have to rebuild when the data changes. So in practice we overlay frequently-changing data (memindex).</div><div class="gmail_quote" dir="auto"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word;line-break:after-white-space" class=""><div class=""><blockquote type="cite" class=""><div class=""><div dir="ltr" class=""><div dir="ltr" class=""><div class="">One question I have is a practical one - I'm sure changes are needed to clangd, are these likely to happen upstream or in a fork/merge cycle?</div></div></div></div></blockquote><div class=""><br class=""></div><div class="">CC’ed AlexL and JanK, they can speak more about this. 
Beyond clangd, I’d like to also mention that we’ll be resuming our upstreaming effort for the index-while-building patches.</div></div></div></blockquote></div><div class="gmail_quote" dir="auto"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word;line-break:after-white-space" class=""><div class=""><br class=""><blockquote type="cite" class=""><div class=""><div dir="ltr" class=""><div dir="ltr" class=""><div class=""><br class=""></div><div class="">Looking forward to seeing more details!</div><div class="">Cheers, Sam</div></div></div><br class=""><div class="gmail_quote"><div dir="ltr" class="">On Tue, Oct 23, 2018 at 8:32 AM Argyrios Kyrtzidis via clangd-dev <<a href="mailto:clangd-dev@lists.llvm.org" target="_blank" rel="noreferrer" class="">clangd-dev@lists.llvm.org</a>> wrote:<br class=""></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word;line-break:after-white-space" class="">Hey all,<br class=""><br class="">We've recently announced that we'll be starting a new open-source project for an LSP language service supporting Swift and C-family languages, see more details in the announcement post (<a href="https://forums.swift.org/t/new-lsp-language-service-supporting-swift-and-c-family-languages-for-any-editor-and-platform" target="_blank" rel="noreferrer" class="">https://forums.swift.org/t/new-lsp-language-service-supporting-swift-and-c-family-languages-for-any-editor-and-platform</a>). I wanted to also mention additional details that relate to Clangd.<br class=""><br class="">Currently, for our C-family support in Xcode (code-completion, clang AST queries) we use libclang, but for the new LSP service we will switch to using Clangd. 
We will also open-source a C++ library for global index queries, which is built on top of LMDB (<a href="https://symas.com/lmdb" target="_blank" rel="noreferrer" class="">https://symas.com/lmdb</a>). The functionality of this library is described by Nathan in his Index-While-Building design document (<a href="https://docs.google.com/document/d/1cH2sTpgSnJZCkZtJl1aY-rzy4uGPcrI-6RrUpdATO2Q" target="_blank" rel="noreferrer" class="">https://docs.google.com/document/d/1cH2sTpgSnJZCkZtJl1aY-rzy4uGPcrI-6RrUpdATO2Q</a>), specifically in the 'Using the index store' section.<br class=""><br class="">Let me elaborate a bit more on how we use this library. From Clang (and Swift) we get raw index data files, either directly from building or from invoking clang for background indexing. These data record files are designed to be efficient to write and update, ensuring that record files for headers are only written once, so that index-while-building has minimal overhead. But they are not designed to do efficient global queries (give me all symbol occurrences of this symbol USR). To accommodate this we use this database library which is a lightweight index layer on top of the raw index records. It reads the raw index data files and populates a key-value database that enables efficient global queries (it essentially determines what raw index record files contain the relevant information and retrieves the data).<br class=""><br class="">In our design for having full cross-language support for Swift and Clang languages (e.g. call-hierarchy across languages), we prefer to have a language-independent indexing component that is layered on top of the compiler-specific support (Clang/Clangd and Swift/sourcekitd). 
That means that our LSP service will contain an indexing and global refactoring engine and it will delegate to Clangd for clang-specific document queries, like code-completion.<br class=""><br class="">I understand that Clangd is intended to be a self-contained language service, that includes functionality for global index queries along with document-specific queries, but we believe we could still collaborate on common infrastructure shared by both Clangd and our new cross-language LSP service. See AlexL's previous post about how we intend to use Clangd, <a href="https://lists.llvm.org/pipermail/cfe-dev/2018-April/057668.html" target="_blank" rel="noreferrer" class="">https://lists.llvm.org/pipermail/cfe-dev/2018-April/057668.html</a> and what kind of improvements we want to make.<br class=""><br class="">Once we have the repositories up, you'll be able to check out our overall design in more detail, and in the meantime I'd be happy to hear any feedback or questions you may have!</div>_______________________________________________<br class="">

</blockquote></div>

</div></blockquote></div><br class=""><br class=""></div></blockquote></div></div>

</div></blockquote></div><br class=""></body></html>