[PATCH] D88814: [clangd] Enable partial namespace matches for workspace symbols

Sam McCall via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed Oct 7 10:23:39 PDT 2020


sammccall added inline comments.


================
Comment at: clang-tools-extra/clangd/FindSymbols.cpp:44
 
+// Returns true if \p Query can be found as a sub-scope inside \p Scope.
+bool approximateScopeMatch(llvm::StringRef Scope,
----------------
kadircet wrote:
> sammccall wrote:
> > I had a little trouble following this...
> > It seems a little simpler (fewer vars to track) if we avoid the up-front split on scopes.
> > 
> > ```
> > assert(Scope.empty() || Scope.endswith("::")); // or handle in some way
> > // Walk through Scope, consuming matching tokens from Query.
> > StringRef First;
> > while (!Scope.empty() && !Query.empty()) {
> >   tie(First, Scope) = Scope.split("::");
> >   if (Query.front() == First)
> >     Query = Query.drop_front();
> > }
> > return Query.empty(); // all components of query matched
> > ```
> > 
> > in fact we can avoid preprocessing query too:
> > 
> > ```
> > // Query is just a StringRef
> > assert(Scope.empty() || Scope.endswith("::")); // or handle in some way
> > assert(Query.empty() || Query.endswith("::"));
> > 
> > // Walk through Scope, consuming matching tokens from Query.
> > StringRef First;
> > while (!Scope.empty() && !Query.empty()) {
> >   tie(First, Scope) = Scope.split("::");
> >   Query.consume_front(StringRef(First.data(), First.size() + 2) /*Including ::*/);
> > }
> > return Query.empty(); // all components of query matched
> > ```
> Yes but they would do different things. I believe the confusion is caused by usage of `sub-scope` without a clear definition.  The codes you've suggested are performing sub-sequence matches rather than sub-string(i.e. we are looking for a contigious segment in `Scope` that matches `Query`).
> 
> I believe a query of the form `a::c::` shouldn't be matched by `a::b::c::`. I can try simplifying the logic, but it would be nice to agree on the behaviour :D.
> 
> Sorry if I miscommunicated this so far.
Ah right, I was indeed misreading the code. Let's have some definitions...

given query `a::b::Foo` with scope `a::b::`

| | a::b:: | `W::a::b::` | `a::X::b::` | `a::b::Y` |
| exact | * | | | |
| prefix |*| | | * |
| suffix | *|* | | |
| substring | * | * | | * |
| subsequence | * | * | * | * |

These support correcting different types of "errors":
 - exact: none
 - prefix: may omit namespaces immediately before Foo
 - suffix: query may be rooted anywhere (other than global ns)
 - substring: query rooted anywhere, omit namespaces before Foo
 - subsequence: may omit any component

We know "exact" is too strict.

I think "prefix" and by extension "substring" aren't particularly compelling rules as the "immediately before Foo" requirement is arbitrary.
Why does `a::b::Foo` match `a::b::c::Foo` and not `a::c::b::Foo`? In each case we've omitted a namespace inside the query, the only difference is what it's sandwiched between.

Suffix makes intuitive sense, it accepts strings that make sense *somewhere* in the codebase.
Subsequence makes intuitive sense too: you're allowed to forget uninteresting components, similar to how fuzzy-match lets you omit uninteresting words.

I'd prefer one of those and don't really mind which - I'd assumed subsequence was intended. Suffix is way easier to implement though :-)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D88814/new/

https://reviews.llvm.org/D88814



More information about the cfe-commits mailing list