[cfe-dev] Unbundling key parts of libIndex/IndexingContext

Alex L via cfe-dev cfe-dev at lists.llvm.org
Fri Aug 9 17:55:20 PDT 2019


Hi Sam,

I think it would be great if the current implementation was modularized in
a way that would allow Clang and related tools to remove recursive tree
visitors that are either abused for other purposes, or just duplicate work
that some other visitors are doing, either partially or fully.

I think starting out with a new implementation is not a bad approach. That
said, I think it would be really good if the new implementation it was
adopted by libIndex from the start, as this would allow:
- A clear path towards convergence of the new implementation and the old
implementation
- More coverage for testing
The primary risk of not integrating this into libIndex directly is the high
cost of transition to a unified implementation at a later stage by other
adopters. Also, there's a real downside of not unifying code from the start
(which we know too well unfortunately :( ): the general build and code size
cost of additional tree visitor code in Clang and its tools.

I think while it's desirable to change implementation in an incremental way
from the perspective of reviews and code history, sometimes it's not
entirely practical as it's actually harder to see the full picture, and to
qualify the change by doing additional testing by other adopters. I think
for a change like this a patch that entirely rewrites a lot of the code in
libIndex is acceptable. So my suggestion would be to:
- Work on unified implementation with libIndex from the start
- Migrate libIndex in one patch
- The reviewers or you might find ways to split up the patch into smaller,
incremental changes during the review
- The reviewers will be able to test one patch only

Thanks,
Alex








On Fri, 9 Aug 2019 at 10:26, Sam McCall <sammccall at google.com> wrote:

> TL;DR: I'd really like to extract the part that goes from AST node ->
> Decl + Role, but I don't know how to do it incrementally. Current plan
> would be to copy it to something parallel, and maybe try to port libIndex
> later. That sounds likely to end in permanent duplication though. Help!
>
> --
>
> Hi tooling folks,
>
> There are several pieces in libIndex that are important for indexing, but
> also useful for other things. clangd (and I would guess libclang) has been
> (ab)using these, e.g. implementing "go to definition" by indexing the file
> and looking for decl references that overlap the cursor.
> But performance, flexibility, and understanding tend to suffer, as the
> APIs don't always suit the problem space.
>
> I think IndexingAction/IndexingContext do the following things:
> A) expose macros while parsing/recorded in PP through a common interface
> B) traverse the AST (according to some policy) to find AST nodes
> C) determine which tokens in the source refer to those nodes
> D) determine which decls AST nodes refer to, and with which roles
> E) find decls related to an AST node (e.g. overrides)
> There's also F) getSymbolInfo(), which is already split out.
>
> These seem mostly[1] disjoint to me, and can each be replaced/dropped in
> some scenarios. examples:
> - finding AST nodes (B) is sometimes better done with
> ASTSelection/SelectionTree/RecursiveASTVisitor.
> - hover-to-evaluate-constant needs C but not D
> - analyses like extract-variable care about D but not C or E
>
> AST Node -> Decl is the largest and most useful chunk. I've built a
> standalone prototype which does pretty well on C++, and porting more of the
> logic from libIndex could bring it up to par.
> The API is something like
>   vector<pair<const Decl*, SymbolRoleSet>> findDecl(DynTypedNode,
>      ReferencePolicy, // lexical references only, or semantic ones like
> `decltype(...)` too?
>      AliasPolicy, // for typedefs etc, return the alias, the underlying
> decl, or both?
>      TemplatePolicy, // return specialization, the relevant pattern, the
> primary template, or some combination?
>   )
> The Policy options allow us to more directly express features and fix some
> bugs.
>
> My main concern is that I'm not sure how to eventually (ideally
> incrementally!) use this in libIndex.
> The natural implementation is a set of mutually recursive functions to
> handle Decl/Type/Stmt etc (much like libIndex's structure), so before
> calling any of them, all must be complete and correct.
> And SymbolRelations seem to sometimes need information from the AST
> traversal, and sometimes information from the node->decl walk, so it's hard
> to see how they can be separated.
>
> So though it seems insane to have two copies of this logic, I find myself
> leaning toward a separate implementation. Is there a better way out than
> this? How much value is there in putting it in Tooling vs locally in clangd?
>
> Cheers, Sam
>
> [1] B and E seem fairly tangled, actually.
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20190809/414f35d1/attachment.html>


More information about the cfe-dev mailing list