[PATCH] D88814: [clangd] Enable partial namespace matches for workspace symbols
Sam McCall via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Wed Oct 7 10:23:39 PDT 2020
sammccall added inline comments.
================
Comment at: clang-tools-extra/clangd/FindSymbols.cpp:44
+// Returns true if \p Query can be found as a sub-scope inside \p Scope.
+bool approximateScopeMatch(llvm::StringRef Scope,
----------------
kadircet wrote:
> sammccall wrote:
> > I had a little trouble following this...
> > It seems a little simpler (fewer vars to track) if we avoid the up-front split on scopes.
> >
> > ```
> > assert(Scope.empty() || Scope.endswith("::")); // or handle in some way
> > // Walk through Scope, consuming matching tokens from Query.
> > StringRef First;
> > while (!Scope.empty() && !Query.empty()) {
> > tie(First, Scope) = Scope.split("::");
> > if (Query.front() == First)
> > Query = Query.drop_front();
> > }
> > return Query.empty(); // all components of query matched
> > ```
> >
> > in fact we can avoid preprocessing query too:
> >
> > ```
> > // Query is just a StringRef
> > assert(Scope.empty() || Scope.endswith("::")); // or handle in some way
> > assert(Query.empty() || Query.endswith("::"));
> >
> > // Walk through Scope, consuming matching tokens from Query.
> > StringRef First;
> > while (!Scope.empty() && !Query.empty()) {
> > tie(First, Scope) = Scope.split("::");
> > Query.consume_front(StringRef(First.data(), First.size() + 2) /*Including ::*/);
> > }
> > return Query.empty(); // all components of query matched
> > ```
> Yes but they would do different things. I believe the confusion is caused by usage of `sub-scope` without a clear definition. The codes you've suggested are performing sub-sequence matches rather than sub-string(i.e. we are looking for a contigious segment in `Scope` that matches `Query`).
>
> I believe a query of the form `a::c::` shouldn't be matched by `a::b::c::`. I can try simplifying the logic, but it would be nice to agree on the behaviour :D.
>
> Sorry if I miscommunicated this so far.
Ah right, I was indeed misreading the code. Let's have some definitions...
given query `a::b::Foo` with scope `a::b::`
| | a::b:: | `W::a::b::` | `a::X::b::` | `a::b::Y` |
| exact | * | | | |
| prefix |*| | | * |
| suffix | *|* | | |
| substring | * | * | | * |
| subsequence | * | * | * | * |
These support correcting different types of "errors":
- exact: none
- prefix: may omit namespaces immediately before Foo
- suffix: query may be rooted anywhere (other than global ns)
- substring: query rooted anywhere, omit namespaces before Foo
- subsequence: may omit any component
We know "exact" is too strict.
I think "prefix" and by extension "substring" aren't particularly compelling rules as the "immediately before Foo" requirement is arbitrary.
Why does `a::b::Foo` match `a::b::c::Foo` and not `a::c::b::Foo`? In each case we've omitted a namespace inside the query, the only difference is what it's sandwiched between.
Suffix makes intuitive sense, it accepts strings that make sense *somewhere* in the codebase.
Subsequence makes intuitive sense too: you're allowed to forget uninteresting components, similar to how fuzzy-match lets you omit uninteresting words.
I'd prefer one of those and don't really mind which - I'd assumed subsequence was intended. Suffix is way easier to implement though :-)
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D88814/new/
https://reviews.llvm.org/D88814
More information about the cfe-commits
mailing list