[cfe-dev] Adding indexing support to Clangd
Alex L via cfe-dev
cfe-dev at lists.llvm.org
Fri Aug 11 09:16:39 PDT 2017
On 11 August 2017 at 17:09, Doug Schaefer via cfe-dev <
cfe-dev at lists.llvm.org> wrote:
> Sorry gang, Outlook isn’t treating this thread very well. I’ll have to top
> post my response.
>
>
>
> The CDT indexer and database were written in 2005 so the data is long gone
> and we’re talking about different database technologies anyway. All I
> remember was it was slower by multiple orders of magnitude on writes. And
> that stopped my investigation right there. At the time I was trying to get
> a full index of the Firefox source down from an hour to a few minutes and
> incremental indexing times down to less than a second (thanks to the header
> caching). The DB was taking me in the other direction.
>
>
>
> I’m just not sure architecting for simple lookups is the right approach.
> Just consider probably the most common use case for clangd, content assist.
> Given an offset into the file, determine the context/scope of the content
> and a prefix. You then need to find all matches for that prefix. That’s by
> no means a simple lookup given all the declarations, type hierarchies,
> scope hierarchies in play in that context. Finding references is another
> good one that also requires scope analysis to make sure you’re actually
> returning references to that exact definition and not just something that
> happens to have the same name. The CDT index has links back and forth
> between these things.
>
>
Clang shouldn't query by name when looking up references, instead USRs
should be used. The USRs deal with scopes correctly (Hence the U in USR).
>
>
> Again, I’m just relaying my experience haven built one of these things
> before. And it was a long time ago. All avenues are worth exploring.
>
>
>
> Doug.
>
>
>
>
>
> *From:* Manuel Klimek [mailto:klimek at google.com]
> *Sent:* Friday, August 11, 2017 10:59 AM
>
> *To:* Doug Schaefer <dschaefer at blackberry.com>; David Chisnall <
> David.Chisnall at cl.cam.ac.uk>; Marc-André Laperle <
> marc-andre.laperle at ericsson.com>
> *Cc:* via cfe-dev <cfe-dev at lists.llvm.org>
> *Subject:* Re: [cfe-dev] Adding indexing support to Clangd
>
>
>
> On Fri, Aug 11, 2017 at 4:50 PM Doug Schaefer <dschaefer at blackberry.com>
> wrote:
>
>
>
>
>
> *From:* Manuel Klimek [mailto:klimek at google.com]
> *Sent:* Friday, August 11, 2017 4:54 AM
> *To:* Doug Schaefer <dschaefer at blackberry.com>; David Chisnall <
> David.Chisnall at cl.cam.ac.uk>; Marc-André Laperle <
> marc-andre.laperle at ericsson.com>
> *Cc:* via cfe-dev <cfe-dev at lists.llvm.org>
>
>
> *Subject:* Re: [cfe-dev] Adding indexing support to Clangd
>
> On Thu, Aug 10, 2017 at 5:57 PM Doug Schaefer via cfe-dev <
> cfe-dev at lists.llvm.org> wrote:
>
> > -----Original Message-----
> > From: Dr D. Chisnall [mailto:dc552 at hermes.cam.ac.uk] On Behalf Of David
> > Chisnall
> > Sent: Thursday, August 10, 2017 6:10 AM
> > To: Marc-André Laperle <marc-andre.laperle at ericsson.com>
> > Cc: via cfe-dev <cfe-dev at lists.llvm.org>; zeratul976 at hotmail.com; Doug
> > Schaefer <dschaefer at blackberry.com>
> > Subject: Re: [cfe-dev] Adding indexing support to Clangd
> >
> > On 8 Aug 2017, at 18:52, Marc-André Laperle via cfe-dev <cfe-
> > dev at lists.llvm.org> wrote:
> > >
> > > --ClangdIndexStorage--
> > > malloc-like interface that allocates/frees data blocks of variable
> sizes on
> > disk. The blocks can contain ints, shorts, strings, pointers (i.e. file
> offsets),
> > etc. The data is cached in 4K pieces so that local and repeated accesses
> are all
> > done quickly, in memory.
> > > Clangd mallocs and writes its index model objects using this.
> > >
> > > --BTree--
> > > A pretty classic BTree implementation. Used to speed up lookup (symbol
> > names, file names). It allocates its nodes using ClangdIndexStorage
> therefore
> > it is stored on disk. Keys are actually records in ClangdIndexStorage so
> you
> > can really think of the BTree as a collection of sorted pointers (sorted
> > according to a provided comparator).
> >
> > This sounds very like bdb. Is there a reason that we’re reimplementing a
> > large chunk of bdb, rather than just using it (or using something like
> sqlite for
> > the index storage)?
>
> It looks like Marc-Andre is following our work on the Eclipse CDT indexer.
> We found databases to be just too slow.
>
>
>
> That is somewhat surprising. Which db's did you benchmark against?
>
>
>
>
>
> This was many years ago and we were trying Apache Derby (since Eclipse is
> written in Java). And at the time the biggest issue was write performance
> totally destroying any gains we made with header caching. But that may have
> just been Derby.
>
>
>
> Though I think the bigger issue is that the language model is a graph
> which doesn’t fit naturally in tables. You’ll end up spending more time
> fighting that than building out your own data structures. But that’s just
> my experience.
>
>
>
> Yea, the language model not fitting into tables well does match my
> intuition, but for simple queries, I'd have expected a db to be fast enough
> with room to spare. Would be curious about more details / numbers.
>
>
>
>
>
>
>
>
>
> Also, forcing the C++ language model to fit into a bunch of tables or
> key/value pairs just didn't make that much sense in the end. The data is
> very complex.
>
> Instead it's better to think of this as a big graph data structure that
> just happens to be backed by memory mapped file store. Clients generally
> deal directly with pointers into that data structure and follow pointers in
> the graph. The BTrees help speed up that navigation at various nodes in the
> graph. It's super fast.
>
> Doug.
> _______________________________________________
> cfe-dev mailing list
> cfe-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>
>
> _______________________________________________
> cfe-dev mailing list
> cfe-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20170811/d8aab2a6/attachment.html>
More information about the cfe-dev
mailing list