<div dir="ltr">Thanks for the insights. I think I get the gist of the "module" PCH idea. <div>One question: what if the system headers are included after the user includes? Then we abandon the PCH cache and parse from scratch, right?</div><div><br></div><div><div><span style="font-size:12.8px">A FileSystemStatCache that is reused between compilation units? Sounds like low-hanging fruit for indexing, thanks.</span><br></div></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Jun 1, 2017 at 11:52 AM, Vladimir Voskresensky <span dir="ltr"><<a href="mailto:vladimir.voskresensky@oracle.com" target="_blank">vladimir.voskresensky@oracle.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000">
Hi Ilia,<br>
<br>
Sorry for the late reply.<br>
Unfortunately, the hacks I mentioned were done a long time ago, and I couldn't
find the changes at first glance :-(<br>
<br>
But you can think of reusable chained PCHs in a "module" way.<br>
Each system header is a module. <br>
There are special index_headers.c and index_headers.cpp files which
include all standard headers.<br>
These files are indexed first and create a "module" per #include.<br>
A module is created once, or several times if the preprocessor contexts
are very different, e.g. C vs. C++98 vs. C++14.<br>
It is then reused.<br>
Of course this could compromise accuracy, but for a proof of concept it
was enough to see that the expected indexing speed can be achieved
in theory. <br>
<br>
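A minimal sketch of the "module per system header" idea described above. This is not the original NetBeans code, which is not shown in this thread. All names (LangContext, HeaderModule, ModuleCache, indexIncludes) are hypothetical, and the expensive parse is replaced by a simple counter:<br>
<pre>
```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical parse contexts; the text above mentions C vs. C++98 vs. C++14.
enum class LangContext { C, Cxx98, Cxx14 };

// Hypothetical stand-in for the indexed result of one system header.
struct HeaderModule {
    std::string header;
    LangContext context;
};

// Holds one "module" per (header, context) pair.
class ModuleCache {
public:
    // Returns the cached module, building it on the first request only.
    const HeaderModule& get(const std::string& header, LangContext ctx) {
        auto key = std::make_pair(header, ctx);
        auto it = modules_.find(key);
        if (it == modules_.end()) {
            ++builds_;  // stands in for the expensive parse of the header
            it = modules_.emplace(key, HeaderModule{header, ctx}).first;
        }
        return it->second;
    }

    // Number of modules actually built so far.
    int builds() const { return builds_; }

private:
    std::map<std::pair<std::string, LangContext>, HeaderModule> modules_;
    int builds_ = 0;
};

// Looks up every include of one translation unit, in the order given.
void indexIncludes(ModuleCache& cache,
                   const std::vector<std::string>& includes,
                   LangContext ctx) {
    for (const auto& header : includes) {
        cache.get(header, ctx);
    }
}
```
</pre>
With such a cache, indexing one file that includes vector then map, and another that includes map then vector, builds each module only once per context. Ignoring the include order is exactly the part that can compromise accuracy.<br>
<br>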
Btw, another hint: implementing FileSystemStatCache gave the next
visible speedup. Of course, it needs to be carefully invalidated/updated
when a file is modified in the IDE or externally.<br>
So in the end we got just a 2x slowdown, but with the accuracy of a "real"
compiler. And then, as you know, we started Clank :-)<br>
<br>
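A minimal sketch of a stat cache shared between compilation units, with explicit invalidation as mentioned above. It does not use clang's FileSystemStatCache interface; StatCache, FileStatus and the injected stat function are hypothetical names chosen for illustration:<br>
<pre>
```cpp
#include <cstdint>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical subset of stat() data an indexer might care about.
struct FileStatus {
    bool exists;
    std::uint64_t size;
    std::int64_t mtime;
};

// Caches stat results across compilation units. The stat function is
// injected so the real filesystem call can be swapped out.
class StatCache {
public:
    using StatFn = std::function<FileStatus(const std::string&)>;

    explicit StatCache(StatFn statFn) : statFn_(std::move(statFn)) {}

    // Returns the cached status, calling the stat function only on a miss.
    const FileStatus& get(const std::string& path) {
        auto it = entries_.find(path);
        if (it == entries_.end()) {
            it = entries_.emplace(path, statFn_(path)).first;
        }
        return it->second;
    }

    // Must be called when the IDE or an external tool modifies the file;
    // the next get() for this path queries the stat function again.
    void invalidate(const std::string& path) { entries_.erase(path); }

private:
    StatFn statFn_;
    std::unordered_map<std::string, FileStatus> entries_;
};
```
</pre>
Until invalidate() is called for a path, get() keeps returning the old status, which is why the invalidation has to be wired to both editor saves and external file changes.<br>
<br>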
Hope it helps,<br>
Vladimir.<div><div class="h5"><br>
<br>
<div class="m_5048487057408778332moz-cite-prefix">On 29.05.2017 11:58, Ilya Biryukov
wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">Hi Vladimir,
<div><br>
</div>
<div>Thanks for sharing your experience.</div>
<div><br>
<div class="gmail_extra">
<div class="gmail_quote">
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div bgcolor="#FFFFFF">We did such measurements when we
evaluated clang as a technology to be used in NetBeans
C/C++. I don't remember the exact absolute numbers
now, but the conclusion was: </div>
</blockquote>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div bgcolor="#FFFFFF"> to be on par with the existing
NetBeans speed we had to use various kinds of caching;
otherwise it was about 10 times slower.</div>
</blockquote>
<div>That's a good reason to focus on that issue from the
very start, then. It would be nice to have some exact
measurements, though (e.g. on LLVM).</div>
<div>Just to know exactly how slow it was.</div>
<div><br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div bgcolor="#FFFFFF"> +1. Btw, maybe it is worth
setting some expectations about what is available during and
after the initial indexing phase.<br>
I.e. during the initial phase you'd probably like to have
navigation for the file opened in the editor and be able to work in
function bodies.<br>
</div>
</blockquote>
<div>We definitely want diagnostics/completions for the
currently open file to be available. Good point; we
should explicitly name the available
features in the docs/discussions.</div>
<div><br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div bgcolor="#FFFFFF">As to initial indexing:<br>
Using PTH (not PCH) gave a significant speedup.</div>
</blockquote>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div bgcolor="#FFFFFF"> Skipping bodies gave a significant
speedup, but you miss the references and later have to
reindex bodies on demand.<br>
Using chained PCH gave the next visible speedup.<br>
</div>
</blockquote>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div bgcolor="#FFFFFF">Of course we had to make some
hacks for PCHs to be "reusable" more often (compared
to the strict compiler rule) and keep multiple versions,
on average 2: one for the C and one for the C++ parse context.<br>
Also, there is a difference between system headers and
project headers, so system headers can be cached more
aggressively. <br>
</div>
</blockquote>
<div>Is this work open-source? The interesting part is how
to "reuse" the PCH for a header that's included in a
different order. </div>
<div>I.e. is there a way to reuse some cached
information (PCH, or anything else) for <map> and
<vector> when parsing these two files:<br>
</div>
<div>```</div>
<div>// foo.cpp</div>
<div>#include <vector></div>
<div>#include <map></div>
<div>...</div>
<div><br>
</div>
<div>// bar.cpp</div>
<div>#include <map></div>
<div>#include <vector></div>
<div>....</div>
<div>```</div>
</div>
<div><br>
</div>
-- <br>
<div class="m_5048487057408778332gmail_signature">
<div dir="ltr">
<div>
<div dir="ltr">
<div>Regards,</div>
<div>Ilya Biryukov</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</blockquote>
<br>
</div></div></div>
</blockquote></div><br><br clear="all"><div><br></div>-- <br><div class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr"><div><div dir="ltr"><div>Regards,</div><div>Ilya Biryukov</div></div></div></div></div>
</div>