[PATCH] D39050: Add index-while-building support to Clang

Eric Liu via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Mon Dec 11 14:21:22 PST 2017


ioeric added a comment.

Thanks a lot for the changes! Some more comments inlined.

Please mark addressed comments as done so that reviewers could know what to look :) Thanks!



================
Comment at: include/clang/Frontend/CompilerInstance.h:187
+  typedef std::function<std::unique_ptr<FrontendAction>(
+      const FrontendOptions &opts, std::unique_ptr<FrontendAction> action)>
+      ActionWrapperTy;
----------------
nit: LLVM variable names start with upper-case letters.


================
Comment at: include/clang/Index/IndexingAction.h:34
   class IndexDataConsumer;
+  class IndexUnitWriter;
 
----------------
This should be removed?

Some forward declarations above are not used as well.


================
Comment at: lib/Driver/Job.cpp:293
+    if (HaveCrashVFS) {
+      IndexStoreDir = llvm::sys::path::parent_path(
+          llvm::sys::path::parent_path(CrashInfo->VFSPath));
----------------
Could you share this code with line 278 above, which already has a nice comment? 


================
Comment at: lib/Index/FileIndexRecord.cpp:39
+  auto It = std::upper_bound(Decls.begin(), Decls.end(), NewInfo);
+  Decls.insert(It, std::move(NewInfo));
+}
----------------
nathawes wrote:
> ioeric wrote:
> > Why do we need `Decls` to be sorted by offset? If we want this for printing, it might make sense to just do a sort there.
> It's mostly for when we hash them, so that ordering doesn't change the hash, but it's also for printing. The IndexASTConsumer doesn't always report symbol occurrences in source order, due to the preprocessor and a few other cases.
> We can sort them when the IndexRecordDataConsumer's finish() is called rather than as they're added to avoid the copying from repeated insert calls if that's the concern.
I would leave the sorting to the point where records are hashed to avoid making the record stateful. Consider changing `getDeclOccurrences` to `getOccurrencesSortedByOffset`; this should make the behavior more explicit.


================
Comment at: lib/Index/FileIndexRecord.h:51
+public:
+  FileIndexRecord(FileID FID, bool isSystem) : FID(FID), IsSystem(isSystem) {}
+
----------------
s/isSystem/IsSystem/

Also, I wonder if we can filter out system decls proactively and avoid creating file index record for them. We could also avoid propogating `IsSystem` here.


================
Comment at: lib/Index/IndexingAction.cpp:504
+
+    CreatedASTConsumer = true;
+    std::vector<std::unique_ptr<ASTConsumer>> Consumers;
----------------
nathawes wrote:
> ioeric wrote:
> >  Can we get this state from the base class instead of maintaining a another state, which seems to be identical?
> I don't see this state in either base class (WrapperFrontendAction and IndexRecordActionBase). WrappingIndexAction and WrappingIndexRecordAction both have this, though. Were you thinking a new intermediate common base class between them and WrapperFrontendAction?
I thought this could be a state in the `WrapperFrontendAction` since both derived classes maintain this state, but after a closer look, this seems to depend on both base classes. I'm not a big fun of maintaining states in multi-stage classes (e.g. `FrontendAction`), which could be confusing and hard to follow; I think `IndexRecordActionBase::finish(...)` should be able to handle the case where no index consumer is created (i.e. no record/dependency/... is collected).

Also, `IndexRecordActionBase` (and the existing `IndexActionBase `) should really be a component instead of a base class since none of its methods is `virtual`.


================
Comment at: lib/Index/IndexingAction.cpp:370
+public:
+  SourceFilesIndexDependencyCollector(IsSystemFileCache &SysrootPath,
+                                      RecordingOptions recordOpts)
----------------
`IsSystemFileCache &SysrootPath`? What is this parameter?


================
Comment at: lib/Index/IndexingAction.cpp:459
+
+class IndexRecordActionBase {
+protected:
----------------
Please document this class. This can be easily confused with `IndexActionBase` which has a similar name. Same for `IndexAction`/`IndexRecordAction` and `WrappingIndexRecordAction`/`WrappingIndexRecordAction`.  I think these pairs share (especially the wrapping actions) some common logics and could probably be merged. 


================
Comment at: lib/Index/IndexingAction.cpp:485
+
+  void finish(CompilerInstance &CI);
+};
----------------
This does a lot of stuff... please document the behavior!


================
Comment at: lib/Index/IndexingAction.cpp:577
+  if (!isModuleGeneration &&
+      CI.getFrontendOpts().ProgramAction != frontend::GeneratePCH) {
+    RootFile = SM.getFileEntryForID(SM.getMainFileID());
----------------
nit: no need for braces. Same below.


================
Comment at: lib/Index/IndexingAction.cpp:589
+
+static void writeUnitData(const CompilerInstance &CI,
+                          IndexDataRecorder &Recorder,
----------------
In the previous patch, `writeUnitData` does several things including handling modules, dependencies, includes and index records, as well as writing data. It might make sense to add an abstract class (`UnitDataCollector`?) that defines interfaces which make these behavior more explicit. We can then have users pass in an implementation via `createIndexDataRecordingAction` which would also decouple the data collection from data storage in the library.


================
Comment at: lib/Index/IndexingAction.cpp:624
+
+std::unique_ptr<FrontendAction> index::createIndexDataRecordingAction(
+    const FrontendOptions &FEOpts,
----------------
I'm a bit nervous about propagating the entire `FrontendOptions` into the index library. I would simply expose `getIndexOptionsFromFrontendOptions` and have callers parse `FrontendOptions` and pass in only index-related options.


================
Comment at: lib/Index/IndexingContext.h:41
+/// file is considered a system file or not
+class IsSystemFileCache {
+  std::string SysrootPath;
----------------
This name is really confusing... `Is*` is usually used for booleans. Simply call this `SystemFileCache`.


================
Comment at: lib/Index/IndexingContext.h:53
+
+  void setSysrootPath(StringRef path);
+  StringRef getSysrootPath() const { return SysrootPath; }
----------------
How does this affect the existing cached results? Do you need to invalidate them?


================
Comment at: lib/Index/IndexingContext.h:63
   IndexDataConsumer &DataConsumer;
+  IsSystemFileCache &IsSystemCache;
   ASTContext *Ctx = nullptr;
----------------
I think it would be more straightforward to have context own the cache. If `setSysrootPath` is the problem, it might make sense to propagate it via the context or, if necessary, create a new cache when a new `SysrootPath` is set.


https://reviews.llvm.org/D39050





More information about the cfe-commits mailing list