[PATCH] D105904: [clangd] Support `#pragma mark` in the outline

Kadir Cetinkaya via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Tue Aug 3 02:13:25 PDT 2021


kadircet added inline comments.


================
Comment at: clang-tools-extra/clangd/FindSymbols.cpp:682
+  // here since editors won't properly render the symbol otherwise.
+  StringRef MaybeGroupName = Name;
+  if (MaybeGroupName.consume_front("-") &&
----------------
dgoldman wrote:
> kadircet wrote:
> > I think this reads easier:
> > 
> > ```
> > bool IsGroup = Name.consume_front("-");
> > Name = Name.ltrim();
> > if (Name.empty())
> >   Name = IsGroup ? "unnamed group" : ...;
> > ```
> That behavior is slightly different, we want to treat `#pragma mark -Foo` as `-Foo` non group but `#pragma mark - Foo` as `Foo` group.
Oh, I see. I indeed missed the `\s+` in the comment and assumed it was `\s*`.
Is it really important to have that difference? If so, it might be useful to spell it out explicitly in the comment (e.g. `-Foo is not considered as a group`).
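
To make the distinction concrete, here is a minimal standalone sketch of the rule as I understand it from the discussion. `MarkInfo` and `classifyMark` are hypothetical names for illustration, not clangd code; it uses `std::string_view` rather than `llvm::StringRef`, and treating a bare `-` as an unnamed group is an assumption.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical result type for this sketch; not part of clangd.
struct MarkInfo {
  bool IsGroup;
  std::string Name;
};

// Classifies the text that follows `#pragma mark`.
// "-" opens a group only when it is followed by whitespace (the `\s+` rule)
// or by nothing at all (assumption: a bare "-" is an unnamed group).
MarkInfo classifyMark(std::string_view Text) {
  auto LTrim = [](std::string_view S) {
    auto I = S.find_first_not_of(" \t");
    return I == std::string_view::npos ? std::string_view() : S.substr(I);
  };
  std::string_view Name = LTrim(Text);
  if (!Name.empty() && Name.front() == '-') {
    std::string_view Rest = Name.substr(1);
    std::string_view Trimmed = LTrim(Rest);
    if (Rest.empty() || Trimmed.size() != Rest.size())
      return {true, Trimmed.empty() ? std::string("unnamed group")
                                    : std::string(Trimmed)};
  }
  // Not a group: keep the text as-is, including any leading "-".
  // The fallback name for an empty non-group mark is left out here.
  return {false, std::string(Name)};
}

// Usage example covering the two cases discussed above.
void exampleUsage() {
  assert(!classifyMark("-Foo").IsGroup);      // `#pragma mark -Foo`
  assert(classifyMark("-Foo").Name == "-Foo");
  assert(classifyMark("- Foo").IsGroup);      // `#pragma mark - Foo`
  assert(classifyMark("- Foo").Name == "Foo");
}
```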


================
Comment at: clang-tools-extra/clangd/FindSymbols.cpp:535
+/// by range.
+std::vector<DocumentSymbol> mergePragmas(std::vector<DocumentSymbol> &Syms,
+                                         std::vector<PragmaMarkSymbol> &Pragmas,
----------------
dgoldman wrote:
> kadircet wrote:
> > dgoldman wrote:
> > > kadircet wrote:
> > > > dgoldman wrote:
> > > > > sammccall wrote:
> > > > > > FWIW the flow control/how we make progress seem hard to follow here to me.
> > > > > > 
> > > > > > In particular I think I'm struggling with the statefulness of "is there an open mark group".
> > > > > > 
> > > > > > Possible simplifications:
> > > > > >  - define a dummy root symbol, which seems clearer than the vector<symbols> + range
> > > > > >  - avoid reverse-sorting the list of pragma symbols, and just consume from the front of an ArrayRef instead
> > > > > >  - make the outer loop over pragmas, rather than symbols. It would first check if the pragma belongs directly here or not, and if so, loop over symbols to work out which should become children. This seems very likely to be efficient enough in practice (few pragmas, or most children are grouped into pragmas)
> > > > > > define a dummy root symbol, which seems clearer than the vector<symbols> + range
> > > > > 
> > > > > I guess? Then we'd take in a `DocumentSymbol &` and an `ArrayRef<PragmaMarkSymbol> &` (or just by value and then return it as well). The rest would be the same though.
> > > > > 
> > > > > > In particular I think I'm struggling with the statefulness of "is there an open mark group".
> > > > > 
> > > > > We need to track the current open group if there is one in order to move children to it.
> > > > > 
> > > > > > make the outer loop over pragmas, rather than symbols. It would first check if the pragma belongs directly here or not, and if so, loop over symbols to work out which should become children. This seems very likely to be efficient enough in practice (few pragmas, or most children are grouped into pragmas)
> > > > > 
> > > > > The important thing here is knowing where the pragma mark ends - if it doesn't, it actually gets all of the children. So we'd have to peek at the next pragma mark, add all symbols before it to us as children, and then potentially recurse to nest it inside of a symbol. I'll try it out and see if it's simpler.
> > > > > 
> > > > > 
> > > > ```
> > > > while (Pragmas) {
> > > >   // We'll figure out where Pragmas.front() should go.
> > > >   Pragma P = Pragmas.front();
> > > >   DocumentSymbol *Cur = Root;
> > > >   while (Cur->contains(P)) {
> > > >     auto *OldCur = Cur;
> > > >     for (auto *C : Cur->children) {
> > > >       // We assume at most 1 child can contain the pragma (as pragmas are
> > > >       // on a single line, and children have disjoint ranges).
> > > >       if (C->contains(P)) {
> > > >         Cur = C;
> > > >         break;
> > > >       }
> > > >     }
> > > >     // Cur is the immediate parent of P.
> > > >     if (OldCur == Cur) {
> > > >       // Just insert P into children if it is not a group and we are done.
> > > >       // Otherwise we need to figure out when the current pragma is terminated:
> > > >       // - If the next pragma is not contained in Cur, or is contained in
> > > >       //   one of the children, it is at the end of Cur: nest all the
> > > >       //   children that appear after P under the symbol node for P.
> > > >       // - Otherwise nest all the children that appear after P but before
> > > >       //   the next pragma under the symbol node for P.
> > > >       // Pop Pragmas and break.
> > > >     }
> > > >   }
> > > > }
> > > > ```
> > > > 
> > > > Does that make sense? I hope I am not missing something obvious. Complexity-wise, in the worst case we'll go all the way down to a leaf once per pragma. Since there will only be a handful of pragmas most of the time, it shouldn't be too bad.
> > > I've implemented your suggestion. I don't think it's simpler, but LMK, maybe it can be improved.
> > Oops, I was looking at an older revision and missed `mergePragmas2`. I think it looks quite similar to this one, but we can probably get rid of the recursion as well and simplify a couple more cases.
> This makes sense. I think that works for the most part, besides dropping the recursion, specifically for
> 
> ```
>       // Next pragma is contained in the Sym, it belongs there and doesn't
>       // affect us at all.
>       if (Sym.range.contains(NextPragma.DocSym.range)) {
>         Sym.children = mergePragmas2(Sym.children, Pragmas, Sym.range);
>         continue;
>       }
> ```
> 
> I guess we could explicitly forbid 3+ layers of nesting and handle it inline there? But I'm not sure it's worth the effort to rewrite all of this - the recursion shouldn't be deep and we avoid needing to shift vector elements over by recreating a new one.
Sorry, I don't follow why we can't get rid of the recursion in this case.

The two-loop solution I described above tries to find the document symbol node such that the current pragma is contained in that node and isn't contained in any of that node's children. It then inserts the pragma into that node and starts traversing the tree from the root again for the next pragma.

Again, I don't follow where the `3+ layers of nesting` constraint came from. But I do feel like the iterative version is somewhat easier to reason about (especially keeping track of what's happening with `Pragmas.front()` and the way it bails out via the `parentrange` check). Shifting the vector is definitely unfortunate, but I think it shouldn't imply big performance hits in practice, as we are only shifting the children of a single node.
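
As a minimal sketch of the descent described above: `Range`, `Node`, `findImmediateParent` and `insertMark` are simplified stand-ins invented for illustration, not clangd's actual `DocumentSymbol` or LSP range types. Group handling (nesting the following siblings under the mark) is deliberately left out, so this only shows how a non-group mark finds its parent without recursion.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Simplified stand-in for an LSP range: half-open line interval [Begin, End).
struct Range {
  int Begin, End;
  bool contains(const Range &O) const {
    return Begin <= O.Begin && O.End <= End;
  }
};

// Simplified stand-in for a document symbol.
struct Node {
  std::string Name;
  Range R;
  std::vector<Node> Children;
};

// Returns the node that contains P while none of its children do.
// Assumes children have disjoint ranges, so at most one child can contain P.
Node &findImmediateParent(Node &Root, const Range &P) {
  assert(Root.R.contains(P) && "the dummy root must span the whole file");
  Node *Cur = &Root;
  for (bool Descended = true; Descended;) {
    Descended = false;
    for (Node &C : Cur->Children) {
      if (C.R.contains(P)) {
        Cur = &C;
        Descended = true;
        break;
      }
    }
  }
  return *Cur;
}

// Inserts a non-group mark as a leaf, keeping siblings ordered by start line.
// Only the children of a single node are shifted by the insertion.
void insertMark(Node &Root, std::string Name, Range P) {
  Node &Parent = findImmediateParent(Root, P);
  auto It = Parent.Children.begin();
  while (It != Parent.Children.end() && It->R.Begin < P.Begin)
    ++It;
  Parent.Children.insert(It, Node{std::move(Name), P, {}});
}
```

Calling `insertMark` once per pragma restarts the walk from the root each time, which matches the "once per pragma, down to a leaf in the worst case" cost mentioned earlier.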


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105904/new/

https://reviews.llvm.org/D105904


