<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class="">Hi, <div class=""><br class=""></div><div class=""><blockquote type="cite" class=""><div class="">On Nov 5, 2021, at 6:37 AM, Haojian Wu via cfe-dev <<a href="mailto:cfe-dev@lists.llvm.org" class="">cfe-dev@lists.llvm.org</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><span class="" style="float: none; display: inline !important;">We’d like to propose a pseudo-parser which can approximately parse C++ (including broken code). It parses a file in isolation, without needing headers, compile flags etc. Ambiguities are resolved heuristically, like clang-format. Its output is a clang::syntax tree, which maps the token sequence onto the C++ grammar.</span><br class=""><span class="" style="float: none; display: inline !important;">Our motivation comes from wanting to add some low latency features (file outline, refactorings etc) in clangd, but we think this is a useful building block for other tools too.</span></div></blockquote><div class=""><div class=""><span class="" style="float: none; display: inline !important;"><br class=""></span></div></div><div class=""><span class="" style="float: none; display: inline !important;">This is quite interesting and exciting!</span></div><div><br class=""><blockquote type="cite" class=""><div class="">On Nov 9, 2021, at 6:34 AM, Sam McCall via cfe-dev <<a href="mailto:cfe-dev@lists.llvm.org" class="">cfe-dev@lists.llvm.org</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><span style="caret-color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none; float: none; 
display: inline !important;" class="">The idea for clangd is to augment (not replace) this index with a pseudo-parser based index that processes each file once. It would be halfway between the AST index and grep. This index would provide the same operations with lower fidelity, and would be replaced by the AST-based index as it completes.</span></div></blockquote></div><br class=""></div><div class="">Could you provide more details about this part: how will the two sources be combined?</div><div class=""><br class=""></div><div class="">E.g., does clangd use only the pseudo-index until the AST one has fully completed, at which point it switches to the AST one (so one or the other exclusively), or does it mix the info from both? Are the data structures and storage different and separate (with separate APIs?), or are the representation and storage of the indexing info shared, so that the two indexes mainly differ in how they derive the indexing information?</div><div class=""><br class=""></div></body></html>