[cfe-dev] RFC: Upstreaming index-while-building

Argyrios Kyrtzidis via cfe-dev cfe-dev at lists.llvm.org
Fri Mar 1 08:36:52 PST 2019


Hi Dmitri,

Could you clarify something? My impression is that clangd uses the same index symbol generation mechanism that IWB (index-while-building) uses as its source: the AST visitation in lib/Index and the related index consumer. I assume clangd takes that as its source of index symbols, processes them, and then generates its higher-level data structures. Is this correct?

IWB aims to be essentially just an efficient serialization mechanism for that same data, generating the same raw data during a build with minimal overhead. It purposefully doesn’t do any higher-level processing of the symbols, e.g. anything that would involve merging index data across files; doing that during a build would be a non-starter. The design is that IWB serializes, during a build, the same data that lib/Index generates for a file, and a higher-level indexing mechanism can then use that raw data as a source for more sophisticated processing (e.g. clangd’s data structures or a database for cross-file queries).
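
To make that split concrete, here is a rough sketch of the two stages. It is not the actual IWB or lib/Index code: the JSON layout, the function names, and the symbol ID string are all made up for illustration, and the real on-disk format is not assumed to look like this.

```python
import json
import tempfile
from collections import defaultdict
from pathlib import Path


def write_unit_record(store: Path, source_file: str, occurrences: list) -> Path:
    """Build-time step (hypothetical): dump one file's raw symbol
    occurrences as-is, with no cross-file processing."""
    record = store / (Path(source_file).name + ".json")
    record.write_text(json.dumps({"file": source_file, "occurrences": occurrences}))
    return record


def merge_records(store: Path) -> dict:
    """Later step, outside the build (hypothetical): read the per-file
    records and group occurrences by symbol ID for cross-file queries."""
    merged = defaultdict(list)
    for record in sorted(store.glob("*.json")):
        data = json.loads(record.read_text())
        for occ in data["occurrences"]:
            merged[occ["usr"]].append((data["file"], occ["line"], occ["role"]))
    return dict(merged)


def demo() -> dict:
    # Two "compilations" each write their own record; merging happens afterwards.
    with tempfile.TemporaryDirectory() as tmp:
        store = Path(tmp)
        write_unit_record(store, "a.cpp",
                          [{"usr": "c:@F@foo#", "line": 3, "role": "definition"}])
        write_unit_record(store, "b.cpp",
                          [{"usr": "c:@F@foo#", "line": 10, "role": "reference"}])
        return merge_records(store)
```

The point of the sketch is only that the build-time step touches one file at a time, while anything that needs to see several files runs later over the stored records.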

What seems to me a great thing to explore is having clangd use the raw data that IWB generates as its source of index symbols, so that it can take advantage of data generated during a build and not have to separately create and process the ASTs of all the translation units in the user’s project to build its data structures.
What do you think, does this make sense?

> On Feb 28, 2019, at 1:17 AM, Dmitri Gribenko via cfe-dev <cfe-dev at lists.llvm.org> wrote:
> 
> Hi Jan,
> 
> I'm very happy that you're picking up this work again!
> 
> Since clangd team and the Apple source tools team talked last time,
> clangd became a lot more full-featured, and I think there's a lot of
> overlap between index-while-building and indexing that is already
> built in clangd in the open source repository.
> 
> I would like to suggest that we figure out a way to unify these
> indexing implementations.  The value proposition for the community is
> that there is no feature duplication.  The value proposition for you
> is that index-while-building would be able to reuse the infrastructure
> that clangd has already built.  For example, global code completion,
> or the fast index that supports complex and fuzzy queries (Dex,
> http://lists.llvm.org/pipermail/cfe-dev/2018-July/058487.html).
> 
> My strawman proposal is that index-while-building should use the same
> data structures as clangd for representing symbols -- please take a
> look at clang-tools-extra/clangd/index/Index.h.  It would also be
> great if index-while-building could reuse the on-disk serialization
> format for the symbol information --
> clang-tools-extra/clangd/index/Serialization.h.
> 
> These are central to the indexing system, and I think we should reuse
> them.  Doing so would allow you to reuse other indexing
> infrastructure, and infrastructure built on top of indexing, like Dex.
> I'm afraid if index-while-building does not speak the same data
> structures for symbols, it is unlikely that the two implementations
> will ever converge.
> 
> What do you think?
> 
> Dmitri