<html>
  <head>
    <meta content="text/html; charset=utf-8" http-equiv="Content-Type">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    <p><br>
    </p>
    <br>
    <div class="moz-cite-prefix">On 06/ 1/17 06:26 PM, Ilya Biryukov via
      cfe-dev wrote:<br>
    </div>
    <blockquote
cite="mid:CANmbtFchigu+JH4qxvJT918w4hpYoSo=eB3T+QUMEi9o3Ogtmg@mail.gmail.com"
      type="cite">
      <div dir="ltr">
        <div>Other IDEs do that very similarly to CDT, AFAIK.
          Compromising correctness, but getting better performance.</div>
        <div>Reusing modules would be nice, and I wonder if it could
          also be made transparent to the users of the tool (i.e. we
          could have an option 'pretend these headers are modules every
          time you encounter them')<br>
        </div>
        <div>I would expect that to break on most projects, though. Not
          sure if people would be willing to use something that spits
          tons of errors on them.</div>
        <div>Interesting direction for prototyping...</div>
      </div>
    </blockquote>
    As Doug mentioned, these header tricks give surprisingly good
    results on the majority of projects :-)<br>
    <br>
    In NetBeans we have a header caching approach similar to CDT's.<br>
    <br>
    The only difference is that when we hit an #include the second time, we
    only check whether we can skip indexing it.<br>
    We still always do "fair lightweight preprocessing" to keep a correct
    context for all possible inner #ifdef/#else/#define directives
    (because they might affect the current file).<br>
    For that we use a per-file APT (Abstract Preprocessor Tree), which is
    constant for the file and is created once, similar to clang's PTH
    (pretokenized headers).<br>
    <br>
    By visiting a file's APT we can produce different output depending on
    the input preprocessor state.<br>
    It can be visited in "light" mode or in "produce tokens" mode, but it
    always gives the correct result from the strict compiler's point of
    view.<br>
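    <br>
    To make the two visiting modes concrete, here is a minimal C++ sketch. It is not the actual NetBeans code: AptNode, MacroState, visit, sampleHeader and aptExample are hypothetical names, and the tree handles only plain tokens, #define and #ifdef/#else.<br>
    <pre>
```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified APT node: a plain token, a #define,
// or an #ifdef block with "then" and "else" children.
struct AptNode {
    enum Kind { Text, Define, IfDef } kind;
    std::string value;                 // token text or macro name
    std::vector<AptNode> thenBranch;   // used by IfDef only
    std::vector<AptNode> elseBranch;   // used by IfDef only
};

using MacroState = std::set<std::string>;  // names of currently defined macros

// Walks the immutable tree. The macro state is always updated ("light" mode);
// tokens are collected only when `out` is non-null ("produce tokens" mode).
void visit(const std::vector<AptNode>& nodes, MacroState& state,
           std::vector<std::string>* out) {
    for (const AptNode& n : nodes) {
        switch (n.kind) {
        case AptNode::Text:
            if (out) out->push_back(n.value);
            break;
        case AptNode::Define:
            state.insert(n.value);
            break;
        case AptNode::IfDef:
            visit(state.count(n.value) ? n.thenBranch : n.elseBranch, state, out);
            break;
        }
    }
}

// Tree for: #ifdef USE_FAST / fast_impl / #define HAVE_FAST / #else / slow_impl / #endif
std::vector<AptNode> sampleHeader() {
    return {AptNode{AptNode::IfDef, "USE_FAST",
                    {AptNode{AptNode::Text, "fast_impl", {}, {}},
                     AptNode{AptNode::Define, "HAVE_FAST", {}, {}}},
                    {AptNode{AptNode::Text, "slow_impl", {}, {}}}}};
}

// Usage: the same tree visited with two different input states.
void aptExample() {
    const std::vector<AptNode> tree = sampleHeader();

    MacroState withFast = {"USE_FAST"};
    std::vector<std::string> tokens;
    visit(tree, withFast, &tokens);          // "produce tokens" mode
    assert(tokens.size() == 1 && tokens[0] == "fast_impl");
    assert(withFast.count("HAVE_FAST") == 1);

    MacroState empty;
    visit(tree, empty, nullptr);             // "light" mode: state only
    assert(empty.empty());
}
```
</pre>
    The tree is built once and never modified; only the MacroState passed in differs between visits, which is why the same header can yield different results in different include contexts.<br>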
    We also do indexing in parallel, and the APT (being immutable) is
    easily shared by the index visitors of all threads.<br>
    Btw, the stat cache is also shared by all indexing threads, with
    appropriate synchronization.<br>
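    <br>
    As an illustration of a stat cache shared between indexing threads, here is a small C++ sketch. It is only an assumption about the general shape, not NetBeans or clang code: StatInfo, SharedStatCache and statCacheExample are hypothetical names, the real stat call is injected as a function, and a single mutex is the simplest possible synchronization.<br>
    <pre>
```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stat result; a real one would also carry mtime, file type, etc.
struct StatInfo {
    bool exists;
    long size;
};

// Thread-safe cache in front of an expensive stat-like call.
class SharedStatCache {
public:
    explicit SharedStatCache(std::function<StatInfo(const std::string&)> statFn)
        : statFn_(std::move(statFn)) {}

    StatInfo get(const std::string& path) {
        // One lock for lookup and fill: simple, but it serializes misses.
        std::lock_guard<std::mutex> lock(mu_);
        auto it = cache_.find(path);
        if (it != cache_.end()) return it->second;
        StatInfo info = statFn_(path);   // the file system is hit only on a miss
        cache_.emplace(path, info);
        return info;
    }

    // To be called when a file is modified in the IDE or externally.
    void invalidate(const std::string& path) {
        std::lock_guard<std::mutex> lock(mu_);
        cache_.erase(path);
    }

private:
    std::function<StatInfo(const std::string&)> statFn_;
    std::mutex mu_;
    std::unordered_map<std::string, StatInfo> cache_;
};

// Usage with a fake stat function that counts how often it is called.
void statCacheExample() {
    int calls = 0;
    SharedStatCache cache([&calls](const std::string&) {
        ++calls;
        return StatInfo{true, 42};
    });
    cache.get("a.h");
    cache.get("a.h");            // served from the cache
    assert(calls == 1);
    cache.invalidate("a.h");
    cache.get("a.h");            // stat is repeated after invalidation
    assert(calls == 2);
}
```
</pre>
    Every indexing thread would call get() on the same SharedStatCache instance, so each path is stat'ed once until it is invalidated.<br>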
    <br>
    So in NetBeans we observe that with these tricks (which really look
    like multiple modules per header file) the majority of projects are
    indexed with very good accuracy, and I can also confirm that it gives a ~10x speedup.<br>
    <br>
    Hope it helps,<br>
    Vladimir.<br>
    <br>
    <blockquote
cite="mid:CANmbtFchigu+JH4qxvJT918w4hpYoSo=eB3T+QUMEi9o3Ogtmg@mail.gmail.com"
      type="cite">
      <div class="gmail_extra"><br>
        <div class="gmail_quote">On Thu, Jun 1, 2017 at 5:14 PM, David
          Blaikie <span dir="ltr"><<a moz-do-not-send="true"
              href="mailto:dblaikie@gmail.com" target="_blank">dblaikie@gmail.com</a>></span>
          wrote:<br>
          <blockquote class="gmail_quote" style="margin:0 0 0
            .8ex;border-left:1px #ccc solid;padding-left:1ex">
            <div dir="ltr">Not sure this has already been discussed, but
              would it be practical/reasonable to use Clang's modules
              support for this? Might keep the implementation much
              simpler - and perhaps provide an extra incentive for users
              to modularize their build/code which would help their
              actual build times (& heck, parsed modules could even
              potentially be reused between indexer and final build -
              making apparent build times /really/ fast)</div>
            <br>
            <div class="gmail_quote">
              <div>
                <div class="h5">
                  <div dir="ltr">On Thu, Jun 1, 2017 at 8:12 AM Doug
                    Schaefer via cfe-dev <<a moz-do-not-send="true"
                      href="mailto:cfe-dev@lists.llvm.org"
                      target="_blank">cfe-dev@lists.llvm.org</a>>
                    wrote:<br>
                  </div>
                </div>
              </div>
              <blockquote class="gmail_quote" style="margin:0 0 0
                .8ex;border-left:1px #ccc solid;padding-left:1ex">
                <div>
                  <div class="h5">
                    <div
style="word-wrap:break-word;color:rgb(0,0,0);font-size:14px;font-family:Calibri,sans-serif">
                      <div>I thought I’d chip in and describe Eclipse
                        CDT’s strategy with header caching. It’s
                        actually a big cheat but the results have proven
                        to be pretty good.</div>
                      <div><br>
                      </div>
                      <div>CDT’s hack actually starts in the
                        preprocessor. If we see a header file has
                        already been indexed, we skip including it. At
                        the back end, we seamlessly use the index or the
                        current symbol table when doing symbol lookup.
                        Symbols that get missed because we skipped
                        header files get picked up out of the index
                        instead. We also do that in the preprocessor to
                        look up missing macros out of the index when
                        doing macro substitution.</div>
                      <div><br>
                      </div>
                      <div>The performance gains were about an order of
                        magnitude and it magically works most of the
                        time with the main issue being header files that
                        get included multiple times affected by
                        different macro values but the effects of that
                        haven’t been major.</div>
                      <div><br>
                      </div>
                      <div>With clang being a real compiler, I had my
                        doubts that you could even do something like
                        this without adding hooks in places the
                        front-end gang might not like. Love to be proven
                        wrong. It really is very hard to keep up with
                        the evolving C++ standard and we could sure use
                        the help clangd could offer.</div>
                      <div><br>
                      </div>
                      <div>Hope that helps,</div>
                      <div>Doug.</div>
                      <div><br>
                      </div>
                      <span
                        id="m_8883615856890106395m_4236914162035532794OLK_SRC_BODY_SECTION">
                        <div
style="font-family:Calibri;font-size:11pt;text-align:left;color:black;BORDER-BOTTOM:medium
                          none;BORDER-LEFT:medium
none;PADDING-BOTTOM:0in;PADDING-LEFT:0in;PADDING-RIGHT:0in;BORDER-TOP:#b5c4df
                          1pt solid;BORDER-RIGHT:medium
                          none;PADDING-TOP:3pt">
                          <span style="font-weight:bold">From: </span>cfe-dev
                          <<a moz-do-not-send="true"
                            href="mailto:cfe-dev-bounces@lists.llvm.org"
                            target="_blank">cfe-dev-bounces@lists.llvm.<wbr>org</a>>
                          on behalf of Ilya Biryukov via cfe-dev <<a
                            moz-do-not-send="true"
                            href="mailto:cfe-dev@lists.llvm.org"
                            target="_blank">cfe-dev@lists.llvm.org</a>><br>
                          <span style="font-weight:bold">Reply-To: </span>Ilya
                          Biryukov <<a moz-do-not-send="true"
                            href="mailto:ibiryukov@google.com"
                            target="_blank">ibiryukov@google.com</a>><br>
                          <span style="font-weight:bold">Date: </span>Thursday,
                          June 1, 2017 at 10:52 AM<br>
                          <span style="font-weight:bold">To: </span>Vladimir
                          Voskresensky <<a moz-do-not-send="true"
                            href="mailto:vladimir.voskresensky@oracle.com"
                            target="_blank">vladimir.voskresensky@oracle.<wbr>com</a>><br>
                          <span style="font-weight:bold">Cc: </span>via
                          cfe-dev <<a moz-do-not-send="true"
                            href="mailto:cfe-dev@lists.llvm.org"
                            target="_blank">cfe-dev@lists.llvm.org</a>></div>
                      </span></div>
                    <div
style="word-wrap:break-word;color:rgb(0,0,0);font-size:14px;font-family:Calibri,sans-serif"><span
id="m_8883615856890106395m_4236914162035532794OLK_SRC_BODY_SECTION">
                        <div
style="font-family:Calibri;font-size:11pt;text-align:left;color:black;BORDER-BOTTOM:medium
                          none;BORDER-LEFT:medium
none;PADDING-BOTTOM:0in;PADDING-LEFT:0in;PADDING-RIGHT:0in;BORDER-TOP:#b5c4df
                          1pt solid;BORDER-RIGHT:medium
                          none;PADDING-TOP:3pt"><br>
                          <span style="font-weight:bold">Subject: </span>Re:
                          [cfe-dev] Adding indexing support to Clangd<br>
                        </div>
                      </span></div>
                    <div
style="word-wrap:break-word;color:rgb(0,0,0);font-size:14px;font-family:Calibri,sans-serif"><span
id="m_8883615856890106395m_4236914162035532794OLK_SRC_BODY_SECTION">
                        <div><br>
                        </div>
                        <blockquote
id="m_8883615856890106395m_4236914162035532794MAC_OUTLOOK_ATTRIBUTION_BLOCKQUOTE"
                          style="BORDER-LEFT:#b5c4df 5 solid;PADDING:0 0
                          0 5;MARGIN:0 0 0 5">
                          <div>
                            <div>
                              <div dir="ltr">Thanks for the insights, I
                                think I get the gist of the idea with
                                the "module" PCH. 
                                <div>One question is: what if the system
                                  headers are included after the user
                                  includes? Then we abandon the PCH
                                  cache and run the parsing from
                                  scratch, right?</div>
                                <div><br>
                                </div>
                                <div>
                                  <div><span style="font-size:12.8px">FileSystemStatCache
                                      that is reused between compilation
                                      units? Sounds like a low-hanging
                                      fruit for indexing, thanks.</span><br>
                                  </div>
                                </div>
                              </div>
                              <div class="gmail_extra"><br>
                                <div class="gmail_quote">On Thu, Jun 1,
                                  2017 at 11:52 AM, Vladimir
                                  Voskresensky <span dir="ltr">
                                    <<a moz-do-not-send="true"
                                      href="mailto:vladimir.voskresensky@oracle.com"
                                      target="_blank">vladimir.voskresensky@oracle.<wbr>com</a>></span>
                                  wrote:<br>
                                  <blockquote class="gmail_quote"
                                    style="margin:0 0 0
                                    .8ex;border-left:1px #ccc
                                    solid;padding-left:1ex">
                                    <div bgcolor="#FFFFFF"
                                      text="#000000">Hi Ilya,<br>
                                      <br>
                                      Sorry for the late reply.<br>
                                      Unfortunately, the hacks I mentioned
                                      were done a long time ago and I couldn't
                                      find the changes at first
                                      glance :-(<br>
                                      <br>
                                      But you can think about reusable
                                      chained PCHs in the "module" way.<br>
                                      Each system header is a module. <br>
                                      There are special index_headers.c
                                      and index_headers.cpp files which
                                      include all standard headers.<br>
                                      These files are indexed first and
                                      create a "module" per #include.<br>
                                      A module is created once, or several
                                      times if the preprocessor contexts are
                                      very different, like C vs. C++98
                                      vs. C++14.<br>
                                      Then it is reused.<br>
                                      Of course it could compromise the
                                      accuracy, but for a proof of concept
                                      it was enough to see that the expected
                                      indexing speed can be achieved
                                      in theory.
                                      <br>
                                      <br>
                                      Btw, another hint: implementing
                                      FileSystemStatCache gave the next
                                      visible speedup. Of course, you need to
                                      carefully invalidate/update it
                                      when a file is modified in the IDE or
                                      externally.<br>
                                      So, in the end we got just a 2x
                                      slowdown, but with the accuracy of a
                                      "real" compiler. And then, as you
                                      know, we started Clank :-)<br>
                                      <br>
                                      Hope it helps,<br>
                                      Vladimir.
                                      <div>
                                        <div
                                          class="m_8883615856890106395m_4236914162035532794h5"><br>
                                          <br>
                                          <div
class="m_8883615856890106395m_4236914162035532794m_5048487057408778332moz-cite-prefix">On
                                            29.05.2017 11:58, Ilya
                                            Biryukov wrote:<br>
                                          </div>
                                          <blockquote type="cite">
                                            <div dir="ltr">Hi Vladimir,
                                              <div><br>
                                              </div>
                                              <div>Thanks for sharing
                                                your experience.</div>
                                              <div><br>
                                                <div class="gmail_extra">
                                                  <div
                                                    class="gmail_quote">
                                                    <blockquote
                                                      class="gmail_quote"
                                                      style="margin:0px
                                                      0px 0px
                                                      0.8ex;border-left:1px
                                                      solid
                                                      rgb(204,204,204);padding-left:1ex">
                                                      <div
                                                        bgcolor="#FFFFFF">We
                                                        did such
                                                        measurements
                                                        when we evaluated
                                                        clang as a
                                                        technology to be
                                                        used in NetBeans
                                                        C/C++. I don't
                                                        remember the
                                                        exact absolute
                                                        numbers now, but
                                                        the conclusion
                                                        was: </div>
                                                    </blockquote>
                                                    <blockquote
                                                      class="gmail_quote"
                                                      style="margin:0px
                                                      0px 0px
                                                      0.8ex;border-left:1px
                                                      solid
                                                      rgb(204,204,204);padding-left:1ex">
                                                      <div
                                                        bgcolor="#FFFFFF">to
                                                        be on par with
                                                        the existing
                                                        NetBeans speed
                                                        we had to use
                                                        different kinds of
                                                        caching;
                                                        otherwise it was
                                                        about 10 times
                                                        slower.</div>
                                                    </blockquote>
                                                    <div>It's a good
                                                      reason to focus on
                                                      that issue from
                                                      the very start
                                                      then. Would be
                                                      nice to have some
                                                      exact
                                                      measurements,
                                                      though. (i.e. on
                                                      LLVM).</div>
                                                    <div>Just to know
                                                      exactly how slow
                                                      it was.</div>
                                                    <div><br>
                                                    </div>
                                                    <blockquote
                                                      class="gmail_quote"
                                                      style="margin:0px
                                                      0px 0px
                                                      0.8ex;border-left:1px
                                                      solid
                                                      rgb(204,204,204);padding-left:1ex">
                                                      <div
                                                        bgcolor="#FFFFFF">+1.
                                                        Btw, maybe it
                                                        is worth setting
                                                        some
                                                        expectations
                                                        about what is
                                                        available during
                                                        and after the
                                                        initial index
                                                        phase.<br>
                                                        I.e. during the
                                                        initial phase
                                                        you'd probably
                                                        like to have
                                                        navigation for the
                                                        file opened in the
                                                        editor and be able to
                                                        work in
                                                        function
                                                        bodies.<br>
                                                      </div>
                                                    </blockquote>
                                                    <div>We definitely
                                                      want
                                                      diagnostics/completions
                                                      for the currently
                                                      open file to be
                                                      available. Good
                                                      point, we
                                                      definitely want to
                                                      explicitly name
                                                      the available
                                                      features in the
                                                      docs/discussions.</div>
                                                    <div><br>
                                                    </div>
                                                    <blockquote
                                                      class="gmail_quote"
                                                      style="margin:0px
                                                      0px 0px
                                                      0.8ex;border-left:1px
                                                      solid
                                                      rgb(204,204,204);padding-left:1ex">
                                                      <div
                                                        bgcolor="#FFFFFF">As
                                                        to initial
                                                        indexing:<br>
                                                        Using PTH (not
                                                        PCH) gave
                                                        significant
                                                        speedup.</div>
                                                    </blockquote>
                                                    <blockquote
                                                      class="gmail_quote"
                                                      style="margin:0px
                                                      0px 0px
                                                      0.8ex;border-left:1px
                                                      solid
                                                      rgb(204,204,204);padding-left:1ex">
                                                      <div
                                                        bgcolor="#FFFFFF">Skipping
                                                        bodies gave
                                                        significant
                                                        speedup, but you
                                                        miss the
                                                        references and
                                                        later have to
                                                        reindex bodies
                                                        on demand.<br>
                                                        Using chained
                                                        PCH gave the
                                                        next visible
                                                        speedup.<br>
                                                      </div>
                                                    </blockquote>
                                                    <blockquote
                                                      class="gmail_quote"
                                                      style="margin:0px
                                                      0px 0px
                                                      0.8ex;border-left:1px
                                                      solid
                                                      rgb(204,204,204);padding-left:1ex">
                                                      <div
                                                        bgcolor="#FFFFFF">Of
                                                        course we had to
                                                        make some hacks
                                                        for PCHs to be
                                                        "reusable" more
                                                        often
                                                        (compared to the
                                                        strict compiler
                                                        rule) and keep
                                                        multiple
                                                        versions, on
                                                        average 2: one
                                                        for the C and one
                                                        for the C++ parse
                                                        context.<br>
                                                        Also there is a
                                                        difference
                                                        between system
                                                        headers and
                                                        project
                                                        headers, so
                                                        system headers can be
                                                        cached more
                                                        aggressively.
                                                        <br>
                                                      </div>
                                                    </blockquote>
                                                    <div>Is this work
                                                      open-source? The
                                                      interesting part
                                                      is how to "reuse"
                                                      the PCH for a
                                                      header that's
                                                      included in a
                                                      different order. </div>
                                                    <div>I.e. is there a
                                                      way to reuse some
                                                      cached
                                                      information(PCH,
                                                      or anything else)
                                                      for <map>
                                                      and <vector>
                                                      when parsing these
                                                      two files:<br>
                                                    </div>
                                                    <div>```</div>
                                                    <div>// foo.cpp</div>
                                                    <div>#include
                                                      <vector></div>
                                                    <div>#include
                                                      <map></div>
                                                    <div>...</div>
                                                    <div><br>
                                                    </div>
                                                    <div>// bar.cpp</div>
                                                    <div>#include
                                                      <map></div>
                                                    <div>#include
                                                      <vector></div>
                                                    <div>....</div>
                                                    <div>```</div>
                                                  </div>
                                                  <div><br>
                                                  </div>
                                                  -- <br>
                                                  <div
class="m_8883615856890106395m_4236914162035532794m_5048487057408778332gmail_signature">
                                                    <div dir="ltr">
                                                      <div>
                                                        <div dir="ltr">
                                                          <div>Regards,</div>
                                                          <div>Ilya
                                                          Biryukov</div>
                                                        </div>
                                                      </div>
                                                    </div>
                                                  </div>
                                                </div>
                                              </div>
                                            </div>
                                          </blockquote>
                                          <br>
                                        </div>
                                      </div>
                                    </div>
                                  </blockquote>
                                </div>
                                <br>
                                <br clear="all">
                                <div><br>
                                </div>
                                -- <br>
                                <div
                                  class="m_8883615856890106395m_4236914162035532794gmail_signature"
                                  data-smartmail="gmail_signature">
                                  <div dir="ltr">
                                    <div>
                                      <div dir="ltr">
                                        <div>Regards,</div>
                                        <div>Ilya Biryukov</div>
                                      </div>
                                    </div>
                                  </div>
                                </div>
                              </div>
                            </div>
                          </div>
                        </blockquote>
                      </span></div>
                  </div>
                </div>
                ______________________________<wbr>_________________<span
                  class=""><br>
                  cfe-dev mailing list<br>
                  <a moz-do-not-send="true"
                    href="mailto:cfe-dev@lists.llvm.org" target="_blank">cfe-dev@lists.llvm.org</a><br>
                  <a moz-do-not-send="true"
                    href="http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev"
                    rel="noreferrer" target="_blank">http://lists.llvm.org/cgi-bin/<wbr>mailman/listinfo/cfe-dev</a><br>
                </span></blockquote>
            </div>
          </blockquote>
        </div>
        <br>
        <br clear="all">
        <div><br>
        </div>
        -- <br>
        <div class="gmail_signature" data-smartmail="gmail_signature">
          <div dir="ltr">
            <div>
              <div dir="ltr">
                <div>Regards,</div>
                <div>Ilya Biryukov</div>
              </div>
            </div>
          </div>
        </div>
      </div>
      <br>
    </blockquote>
    <br>
  </body>
</html>