<div dir="ltr"><div dir="ltr">On Tue, Nov 9, 2021 at 3:40 AM Andrew Tomazos &lt;<a href="mailto:andrewtomazos@gmail.com">andrewtomazos@gmail.com</a>&gt; wrote:<br></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div dir="ltr">On Mon, Nov 8, 2021 at 11:18 AM Haojian Wu &lt;<a href="mailto:hokein@google.com" target="_blank">hokein@google.com</a>&gt; wrote:<br></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div dir="ltr">IDE use cases (for clangd)<br></div><div class="gmail_quote"><div>-  provide code-folding, outline, syntax highlighting, selection features without a long "warmup" time;</div><div>-  a fast index to provide approximate results;</div><div><br></div><div>Other use cases we aim to support:</div><div>- smart diff and merge tool for C++ code;</div><div>- a fast linter, a cpplint replacement, with clang-tidy-like extensibility;</div><div>- syntactic grep/sed tools;</div></div></div></blockquote><div><br></div><div>* I don't know what "fast index to provide approximate results" means.  Results of what?  Do you mean generating an index?  What will the index be used for?</div></div></div></blockquote><div>clangd has a symbol index to enable codebase-wide operations. 
(see <a href="https://github.com/llvm/llvm-project/blob/main/clang-tools-extra/clangd/index/Index.h">SymbolIndex</a> and some <a href="https://clangd.llvm.org/design/indexing">documentation</a>)</div><div>These include:</div><div><ul><li>go-to-definition: finding a definition associated with a declaration visible in the AST</li><li>code completion: for contexts where the AST cannot provide all results efficiently, such as namespace scopes (including results from non-included headers)</li><li>cross-references: finding references from files that are not part of the current AST</li></ul><div>Today this index is built from ASTs in various ways (see docs), which takes many hours for large codebases (on machines too slow to build them).</div><div>Most results are missing for a long time. Many users turn off indexing (e.g. to avoid battery drain) and the results stay missing. If compile flag metadata is missing for the project, these features don't work at all.</div></div><div><br></div><div>The idea for clangd is to augment (not replace) this index with a pseudo-parser based index that processes each file once. It would be halfway between the AST index and grep. This index would provide the same operations with lower fidelity, and would be replaced by the AST-based index as it completes.</div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div class="gmail_quote"><div>* Syntax highlighting is the only use case of those listed that can tolerate inaccuracy.  For the rest, a correct parse will be more productive.  The trouble is that if people start depending on these features in their workflow, when they fail (and they often will) it will be very disruptive.  
The cost of the disruption outweighs the time saved waiting for a correct parse.</div></div></div></blockquote><div>Our experience with clangd is that people very often value latency over correctness when editing C++ code, and this is a situational, quantitative question.</div><div>As examples, we've failed to replace cpplint and our heuristic outline with clang-tidy and our AST-based outline. Despite being inaccurate and incomplete, users find them useful and are not willing to wait.</div><div> <br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div class="gmail_quote"><div>* I think you are better off spending your time on optimizing the correct parser infrastructure.  I'm sure more can be done - particularly in terms of caching, persisting and reusing state (think like PCH and modules etc).</div></div></div></blockquote><div>We have worked on projects over several years to improve these things (and other aspects such as error-resilience). We agree there's more that can be done, and will continue to work on this. We don't believe this approach will get anywhere near a 100x latency improvement, which is what we're looking for.</div></div></div>
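<div><br></div><div>To make the "augment, not replace" idea for the index more concrete, here is a rough sketch in Python. It is only an illustration: the class and method names (LayeredIndex, add_approx, add_precise, lookup) are hypothetical and are not clangd APIs, and the assumption that precise results replace approximate ones per file is mine, not a description of an actual design.</div>

```python
from dataclasses import dataclass, field


@dataclass
class LayeredIndex:
    """Hypothetical two-layer symbol index: approximate results first, precise later."""

    # Both layers map: file path -> {symbol name -> definition location}.
    approx: dict = field(default_factory=dict)   # filled by a fast pseudo-parse
    precise: dict = field(default_factory=dict)  # filled as AST indexing completes

    def add_approx(self, path, symbols):
        self.approx[path] = dict(symbols)

    def add_precise(self, path, symbols):
        # Assumption: once a file has AST-based results, they shadow the
        # approximate results for that file entirely.
        self.precise[path] = dict(symbols)

    def lookup(self, name):
        """Return (location, fidelity) pairs for every known definition of name."""
        results = []
        for path in sorted(set(self.approx) | set(self.precise)):
            if path in self.precise:
                layer, fidelity = self.precise[path], "ast"
            else:
                layer, fidelity = self.approx[path], "approximate"
            if name in layer:
                results.append((layer[name], fidelity))
        return results


index = LayeredIndex()
index.add_approx("a.cc", {"foo": "a.cc:10"})
index.add_approx("b.cc", {"foo": "b.cc:3", "bar": "b.cc:7"})
print(index.lookup("foo"))  # both hits come from the approximate layer

# The AST index finishes b.cc and finds that it does not define foo after all.
index.add_precise("b.cc", {"bar": "b.cc:7"})
print(index.lookup("foo"))  # only the a.cc hit remains
```

<div>In this sketch, queries are answered from the first moment with lower-fidelity data, and each file's answers are upgraded independently as precise data arrives, so nothing has to wait for the whole codebase to be indexed.</div>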