[PATCH] D40897: [clangd] Introduce a "Symbol" class.

Haojian Wu via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Fri Dec 8 08:19:33 PST 2017


hokein added a comment.

Reorganizing the source files made all the comments invalid in the latest version :(. Feel free to comment on the old version or the new version.



================
Comment at: clangd/Symbol.h:23
+  // The path of the source file where a symbol occurs.
+  std::string FilePath;
+  // The offset to the first character of the symbol from the beginning of the
----------------
sammccall wrote:
> hokein wrote:
> > malaperle wrote:
> > > malaperle wrote:
> > > > sammccall wrote:
> > > > > ioeric wrote:
> > > > > > Is this relative or absolute?
> > > > > Having every symbol own a copy of the filepath seems wasteful.
> > > > > It seems likely that the index will have per-file information too, so this representation should likely be a key to that. Hash of the filepath might work?
> > > > How we model it is that a symbol doesn't have a "location", but its occurrence do. One could consider the location of a symbol to be either its declaration occurrence (SymbolRole::Declaration) or its definition (SymbolRole::Definition).
> > > > What we do to get the location path is each occurrence has a pointer (a "database" pointer, but it doesn't matter) to a file entry and then we get the path from the entry.
> > > > 
> > > > So conceptually, it works a bit like this (although it fetches information on disk).
> > > > ```
> > > > class IndexOccurrence {
> > > > IndexOccurrence *FilePtr;
> > > > 
> > > > std::string Occurrence::getPath() {
> > > >   return FilePtr->getPath();
> > > > }
> > > > };
> > > > ```
> > > Oops, wrong type for the field, it should have been:
> > > ```
> > > class IndexOccurrence {
> > > IndexFile *FilePtr;
> > > 
> > > std::string Occurrence::getPath() {
> > >   return FilePtr->getPath();
> > > }
> > > };
> > > ```
> > > Is this relative or absolute?
> > 
> >  Whether the file path is relative or absolute depends on the build system, the file path could be relative (for header files) or absolute (for .cc files).
> > 
> > > How we model it is that a symbol doesn't have a "location", but its occurrence do.
> > 
> > We will also have a SymbolOccurence structure alongside with Symbol (but it is not  in this patch). The "Location" will be a part of SymbolOccurrence.
> >  
> > Whether the file path is relative or absolute depends on the build system, the file path could be relative (for header files) or absolute (for .cc files).
> 
> I'm not convinced this actually works. There's multiple codepaths to the index, how can we ensure we don't end up using inconsistent paths?
> e.g. we open up a project that includes a system header using a relative path, and then open up that system header from file->open (editor knows only the absolute path) and do "find references".
> 
> I think we need to canonicalize the paths. Absolute is probably easiest.
Absolute path for .cc file is fine, I was a bit concerned about the path for .h file, especially we might use it in `#include`, but we can figure out later.  Changed to absolute file path.


================
Comment at: clangd/Symbol.h:51
+  //   * For classes, the canonical location is where they are defined.
+  SymbolLocation CanonicalLocation;
+  // Information for code completion.
----------------
sammccall wrote:
> hokein wrote:
> > ioeric wrote:
> > > For functions and classes, should we store both declaration and definition locations?
> > I think the symbol occurrences would include both declaration and definition locations. 
> > 
> > The `CanonicalLocation` is providing a fast/convenient way to find the most interested location of the symbol (e.g. for code navigation, or include the missing path for a symbol), without symbol occurrences. 
> I'd be +1 on including both a definition location (if known) and a preferred declaration location, because there's enough use cases that might want definition even if it's not the preferred declaration.
> 
> But i'm fine if we want to omit the separate definition for now. In that case, call this CanonicalDeclaration?
OK, changed it to `CanonicalDeclarationLoc`, and added a FIXME for the definition.


================
Comment at: clangd/Symbol.h:68
+// changed.
+class SymbolCollector : public index::IndexDataConsumer {
+public:
----------------
sammccall wrote:
> hokein wrote:
> > sammccall wrote:
> > > Please pull this into a separate file. Someone providing e.g. symbols from a YAML file shouldn't need to pull in AST stuff.
> > > Mabye `IndexFromAST`, which would sort nicely next to `Index`?
> > I can see various meaning of "Index" here:
> > 
> > 1. `Index` in `index::IndexDataConsumer`, which collects and contructs all symbols by traversing the AST.
> > 2. `Index` term using in clangd, specially for build index on top of these collected symbols.
> > 
> > I think we should be consistent the index for 2), and `SymbolCollector` is more descriptive here.
> `SymbolCollector` is a fine name for the type, but the file should have `Index` in the name, or we should create an `Index` subdirectory. It should be possible to understand which files are part of the index subsystem by scanning the clangd directory.
> 
> (Did you see the comment was about separating into a different file, not about renaming the class?)
Oh, sorry for that. I thought that you were meaning to rename the SymbolCollector class.

I prefer to create an subdirectory `index`, and put everything related to the index to that directory, instead of having `Index` word in the top-level clangd filenames. 


================
Comment at: clangd/Symbol.h:72
+
+  void addSymbol(Symbol S);
+  bool hasSymbol(StringRef Identifier) const;
----------------
sammccall wrote:
> This is dangerous if called after reads, as it invalidates iterators and pointers.
> I don't think we actually indend to support such mutation, so I'd suggest adding an explicit freeze() function. addSymbol() is only valid before freeze(), and reading functions are only valid after. An assert can enforce this.
> (This is a cheap version of a builder, which are more painful to write but may also be worth it).
> 
> If we choose not to enforce this at all, the requirement shold be heavily documented!
I think the only user to mutate this object is SymbolCollector.

Instead of adding `freeze` function, I made it as a private method and declare SymbolCollector as a friend class. Does it look better? 


Repository:
  rCTE Clang Tools Extra

https://reviews.llvm.org/D40897





More information about the cfe-commits mailing list