[cfe-dev] Adding indexing support to Clangd

Thu Jun 1 08:14:52 PDT 2017

Not sure whether this has already been discussed, but would it be
practical/reasonable to use Clang's modules support for this? It might keep
the implementation much simpler - and perhaps provide an extra incentive
for users to modularize their build/code, which would help their actual
build times (& heck, parsed modules could even potentially be reused
between indexer and final build - making apparent build times /really/ fast).

On Thu, Jun 1, 2017 at 8:12 AM Doug Schaefer via cfe-dev <
cfe-dev at lists.llvm.org> wrote:

> I thought I’d chip in and describe Eclipse CDT’s strategy with header
> caching. It’s actually a big cheat but the results have proven to be pretty
> good.
>
> CDT’s hack actually starts in the preprocessor. If we see a header file
> has already been indexed, we skip including it. At the back end, we
> seamlessly use the index or the current symbol table when doing symbol
> lookup. Symbols that get missed because we skipped header files get picked
> up out of the index instead. We also do that in the preprocessor to look up
> missing macros out of the index when doing macro substitution.
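
To make the quoted CDT approach concrete, here is a minimal sketch of the two pieces it describes: skipping headers that are already indexed, and falling back to the index when a symbol is missing from the current symbol table. All names (HeaderIndex, TranslationUnit, shouldEnterHeader, and so on) are hypothetical and chosen for illustration; this is not CDT's or Clang's actual API.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <unordered_set>

// All names here are hypothetical; this is not CDT's or Clang's actual API.
struct SymbolInfo {
  std::string Header; // header that declared the symbol
};

// Persistent index built from earlier parses.
class HeaderIndex {
public:
  void addSymbol(const std::string &Name, const std::string &Header) {
    IndexedHeaders.insert(Header);
    Symbols[Name] = SymbolInfo{Header};
  }
  bool isIndexed(const std::string &Header) const {
    return IndexedHeaders.count(Header) != 0;
  }
  const SymbolInfo *find(const std::string &Name) const {
    auto It = Symbols.find(Name);
    return It == Symbols.end() ? nullptr : &It->second;
  }

private:
  std::unordered_set<std::string> IndexedHeaders;
  std::unordered_map<std::string, SymbolInfo> Symbols;
};

// State for one translation unit being parsed.
class TranslationUnit {
public:
  explicit TranslationUnit(const HeaderIndex &Index) : Idx(Index) {}

  // Preprocessor hook: only enter headers the index has not seen yet.
  bool shouldEnterHeader(const std::string &Header) const {
    return !Idx.isIndexed(Header);
  }

  void declare(const std::string &Name, const std::string &Header) {
    Local[Name] = SymbolInfo{Header};
  }

  // Lookup: current symbol table first, then the index as a fallback.
  const SymbolInfo *lookup(const std::string &Name) const {
    auto It = Local.find(Name);
    if (It != Local.end())
      return &It->second;
    return Idx.find(Name);
  }

private:
  const HeaderIndex &Idx;
  std::unordered_map<std::string, SymbolInfo> Local;
};

// Usage: <vector> was indexed earlier, foo.h is new to this parse.
inline void exampleHeaderSkipping() {
  HeaderIndex Index;
  Index.addSymbol("std::vector", "<vector>");
  TranslationUnit TU(Index);
  assert(!TU.shouldEnterHeader("<vector>")); // skipped, already indexed
  assert(TU.shouldEnterHeader("foo.h"));     // must be parsed
  TU.declare("Foo", "foo.h");
  assert(TU.lookup("std::vector") != nullptr); // served from the index
}
```

Note that the sketch keys the index on the header name alone, so it shares the weakness mentioned in the quoted text: a header included several times under different macro values is treated as a single entry.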
>
> The performance gains were about an order of magnitude, and it magically
> works most of the time. The main issue is header files that get included
> multiple times under different macro values, but the effects of that
> haven’t been major.
>
> With clang being a real compiler, I had my doubts that you could even do
> something like this without adding hooks in places the front-end gang might
> not like. Love to be proven wrong. It really is very hard to keep up with
> the evolving C++ standard and we could sure use the help clangd could offer.
>
> Hope that helps,
> Doug.
>
> From: cfe-dev <cfe-dev-bounces at lists.llvm.org> on behalf of Ilya Biryukov
> via cfe-dev <cfe-dev at lists.llvm.org>
> Reply-To: Ilya Biryukov <ibiryukov at google.com>
> Date: Thursday, June 1, 2017 at 10:52 AM
> To: Vladimir Voskresensky <vladimir.voskresensky at oracle.com>
> Cc: via cfe-dev <cfe-dev at lists.llvm.org>
>
> Subject: Re: [cfe-dev] Adding indexing support to Clangd
>
> Thanks for the insights, I think I get the gist of the idea with the
> "module" PCH.
> One question is: what if the system headers are included after the user
> includes? Then we abandon the PCH cache and run the parsing from scratch,
> right?
>
> A FileSystemStatCache that is reused between compilation units? Sounds like
> low-hanging fruit for indexing, thanks.
>
> On Thu, Jun 1, 2017 at 11:52 AM, Vladimir Voskresensky <
> vladimir.voskresensky at oracle.com> wrote:
>
>> Hi Ilya,
>>
>> Sorry for the late reply.
>> Unfortunately, the hacks I mentioned were done a long time ago and I
>> couldn't find the changes at first glance :-(
>>
>> But you can think about reusable chained PCHs in the "module" way.
>> Each system header is a module.
>> There are special index_headers.c and index_headers.cpp files which
>> include all standard headers.
>> These files are indexed first and create a "module" per #include.
>> A module is created once, or several times if the preprocessor contexts are
>> very different, like C vs. C++98 vs. C++14.
>> Then it is reused.
>> Of course it could compromise the accuracy, but for a proof of concept it was
>> enough to see that the expected indexing speed can be achieved theoretically.
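
One way to picture the "module per #include" idea quoted above is a cache keyed by the header name plus a coarse preprocessor context, built once per key and then reused. The sketch below assumes exactly that; the names ParseContext, HeaderModule and HeaderModuleCache are hypothetical and do not come from NetBeans or Clang, and the builder callback stands in for whatever actually produces the PCH.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical names throughout; this does not reflect NetBeans or Clang code.
enum class ParseContext { C, Cxx98, Cxx14 };

struct HeaderModule {
  std::string Header;
  ParseContext Context;
  // A real implementation would also hold the serialized AST (the PCH).
};

class HeaderModuleCache {
public:
  using Builder =
      std::function<HeaderModule(const std::string &, ParseContext)>;

  explicit HeaderModuleCache(Builder B) : Build(std::move(B)) {}

  // Build the module the first time a (header, context) pair is requested,
  // then hand back the cached one on every later request.
  const HeaderModule &get(const std::string &Header, ParseContext Context) {
    auto Key = std::make_pair(Header, Context);
    auto It = Modules.find(Key);
    if (It == Modules.end())
      It = Modules.emplace(Key, Build(Header, Context)).first;
    return It->second;
  }

  std::size_t size() const { return Modules.size(); }

private:
  Builder Build;
  std::map<std::pair<std::string, ParseContext>, HeaderModule> Modules;
};

// Usage: the same header is built once per context, not once per request.
inline void exampleHeaderModuleCache() {
  int Builds = 0;
  HeaderModuleCache Cache([&Builds](const std::string &H, ParseContext C) {
    ++Builds;
    return HeaderModule{H, C};
  });
  Cache.get("vector", ParseContext::Cxx14);
  Cache.get("vector", ParseContext::Cxx14); // reused
  assert(Builds == 1);
  Cache.get("vector", ParseContext::Cxx98); // different context, new module
  assert(Builds == 2);
}
```

Because the key ignores everything except the header and the coarse context, two files that reach the same header with different macros defined would share one entry. That is the accuracy compromise the quoted text mentions.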
>>
>> Btw, another hint: implementing FileSystemStatCache gave the next visible
>> speedup. Of course, you need to carefully invalidate/update it when a file
>> is modified in the IDE or externally.
>> So, in the end we got just a 2x slowdown, but with the accuracy of a "real"
>> compiler. And then, as you know, we started Clank :-)
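
The stat-cache hint quoted above boils down to memoizing file status lookups across compilation units and dropping an entry when the file changes. Below is a standalone sketch of that idea. It does not use or mirror Clang's own FileSystemStatCache interface; FileStatus, SharedStatCache and the injected stat callback are hypothetical names, and the callback is injected so the example does not depend on a real file system.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical types; not Clang's FileSystemStatCache interface.
struct FileStatus {
  bool Exists;
  std::uint64_t Size;
  std::int64_t ModTime;
};

class SharedStatCache {
public:
  using StatFn = std::function<FileStatus(const std::string &)>;

  // Fn performs the real (expensive) stat call on a cache miss.
  explicit SharedStatCache(StatFn Fn) : RealStat(std::move(Fn)) {}

  // Return the cached status, calling the real stat only on a miss.
  FileStatus get(const std::string &Path) {
    auto It = Cache.find(Path);
    if (It == Cache.end())
      It = Cache.emplace(Path, RealStat(Path)).first;
    return It->second;
  }

  // Must be called when the IDE or a file watcher reports a change.
  void invalidate(const std::string &Path) { Cache.erase(Path); }

private:
  StatFn RealStat;
  std::unordered_map<std::string, FileStatus> Cache;
};

// Usage: two lookups cost one real stat until the entry is invalidated.
inline void exampleStatCacheUsage() {
  int RealCalls = 0;
  SharedStatCache Cache([&RealCalls](const std::string &) {
    ++RealCalls;
    return FileStatus{true, 42, 1000};
  });
  Cache.get("foo.h");
  Cache.get("foo.h");
  assert(RealCalls == 1);
  Cache.invalidate("foo.h"); // e.g. the file was saved in the editor
  Cache.get("foo.h");
  assert(RealCalls == 2);
}
```

The sketch is single-threaded. An indexer that parses compilation units in parallel would need to guard the map with a lock.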
>>
>> Hope it helps,
>> Vladimir.
>>
>>
>> On 29.05.2017 11:58, Ilya Biryukov wrote:
>>
>> Hi Vladimir,
>>
>> Thanks for sharing your experience.
>>
>>> We did such measurements when we evaluated clang as a technology to be
>>> used in NetBeans C/C++. I don't remember the exact absolute numbers now,
>>> but the conclusion was:
>>>
>>> to be on par with the existing NetBeans speed we have to use different
>>> caching, otherwise it was like 10 times slower.
>>>
>> That's a good reason to focus on that issue from the very start, then. Would
>> be nice to have some exact measurements, though (e.g. on LLVM),
>> just to know exactly how slow it was.
>>
>>> +1. Btw, maybe it is worth setting some expectations about what is
>>> available during and after the initial index phase.
>>> I.e. during the initial phase you'd probably like to have navigation for
>>> the file opened in the editor and be able to work in function bodies.
>>>
>> We definitely want diagnostics/completions for the currently open file to
>> be available. Good point, we definitely want to explicitly name the
>> available features in the docs/discussions.
>>
>>> As to initial indexing:
>>> Using PTH (not PCH) gave significant speedup.
>>>
>>> Skipping bodies gave significant speedup, but you miss the references and
>>> later have to reindex bodies on demand.
>>> Using chained PCH gave the next visible speedup.
>>>
>>> Of course we had to make some hacks for PCHs to be more often "reusable"
>>> (compared to the strict compiler rule) and keep multiple versions. On
>>> average 2: one for C and one for C++ parse context.
>>> Also there is a difference between system headers and project headers,
>>> so system ones can be cached more aggressively.
>>>
>> Is this work open-source? The interesting part is how to "reuse" the PCH
>> for a header that's included in a different order.
>> I.e. is there a way to reuse some cached information (PCH, or anything
>> else) for <map> and <vector> when parsing these two files:
>> ```
>> // foo.cpp
>> #include <vector>
>> #include <map>
>> // ...
>>
>> // bar.cpp
>> #include <map>
>> #include <vector>
>> // ...
>> ```
>>
>> --
>> Regards,
>> Ilya Biryukov
>>
>>
>>
>
>
> --
> Regards,
> Ilya Biryukov
>