[PATCH] D39050: Add index-while-building support to Clang

Nathan Hawes via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Mon Nov 27 14:41:00 PST 2017


nathawes added a comment.

Thanks for the feedback @malaperle!



================
Comment at: include/indexstore/IndexStoreCXX.h:84
+
+  std::pair<unsigned, unsigned> getLineCol() {
+    unsigned line, col;
----------------
malaperle wrote:
> From what I understand, this returns the beginning of the occurrence. It would be useful to also have the end of the occurrence. From what I tested in Xcode, when you do "Find Selected Symbol in Workspace", it highlights the symbol name in yellow in the list of matches, so it mush use that LineCol then highlight the matching name. This is works in many situations but others occurrences won't have the name of the symbol. For example:
> "MyClass o1, o2;"
> If I use "Find Selected Symbol in Workspace" on MyClass constructor, if won't be able to highlight o1 and o2
> Do you think it would be possible to add that (EndLineCol)? If not, how would one go about extending libindexstore in order to add additional information per occurrence? It is not obvious to me so far.
> We also need other things, for example in the case of definitions, we need the full range of the definition so that users can "peek" at definitions in a pop-up. Without storing the end of the definition in the index, we would need to reparse the file.
> 
Our approach to related locations (e.g. name, signature, body, and doc comment start/end locs) has been to not include them in the index and derive them from the start location later. There's less data to collect and write out during the build that way, and deriving the other locations isn't that costly usually, as in most cases you 1) don't need to type check or even preprocess to get the related locations, as with finding the end of o1 and o2 in your example, and 2) usually only need to derive locations for a single or small number of occurrences, like when 'peeking' at a definition. 

Are there cases where you think this approach won't work/perform well enough for the indexer-backed queries clangd needs to support?



================
Comment at: include/indexstore/IndexStoreCXX.h:374
+
+  enum class DependencyKind {
+    Unit,
----------------
malaperle wrote:
> As part of this dependency tracking mechanism, I haven't found that it could provide information about about the files including a specific header. So given a header (or source file in odd cases), I do not see a way to get all the files that would need to be reindexed if that header changed. Is that something you achieve outside the index? Or perhaps this is something I missed in the code.
The unit files store the path of the header/source files they depend on as 'File' dependencies. So any unit file with 'File' dependency on header/source file that was modified may need to be re-indexed.

To support finding which specific files include or are included by a given header (rather than which units somehow transitively include it) we also store the file-to-file inclusions in the unit file (retrieved via IndexUnitReader's foreachInclude method below).


================
Comment at: include/indexstore/IndexStoreCXX.h:377
+    Record,
+    File,
+  };
----------------
malaperle wrote:
> Could there be a bit of explanation about what's a File dependency versus record and unit? All units and records are file dependencies, right? Are there any files that are neither records or units?
I'll rename this to SourceFile and add some comments to explain. Unit file dependencies separate the source dependencies into 'File' dependencies and 'Record' dependencies. The 'File' dependencies track the paths of the header/source files seen in the translation unit, while the 'Record' dependencies track which record files have the symbolic content seen in those source files – the header/source file path doesn't appear anywhere in the record file. This separation lets us depend on a single record file corresponding to multiple source files (e.g. when two source files have the same symbolic content), and on a single source file corresponding multiple record files (e.g. when a single header is included multiple times with different preprocessor contexts changing its symbolic content).


https://reviews.llvm.org/D39050





More information about the cfe-commits mailing list