[PATCH] D98710: [clang-tidy] Add --skip-headers, part 1 (use setTraversalScope)

Chih-Hung Hsieh via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Sun Jun 27 03:01:56 PDT 2021


chh added inline comments.


================
Comment at: clang-tools-extra/clang-tidy/ClangTidy.cpp:431
+bool DeclFilter::skipLocation(SourceLocation Location) {
+  // do not skip non-file locations
+  if (!Location.isValid())
----------------
sammccall wrote:
> this comment didn't match the next line.
> Did you intend to check Location.isFile() instead?
I just wanted to make sure that we can get valid FileID.
Updated the comment.




================
Comment at: clang-tools-extra/clang-tidy/ClangTidy.cpp:434
+    return false;
+  auto FID = Sources.getDecomposedExpansionLoc(Location).first;
+  // Quick check against last checked results.
----------------
sammccall wrote:
> sammccall wrote:
> > maybe add a comment explaning why the expansion loc is chosen?
> note that this means we're going to traverse up the macro expansion edges even if we're hitting the cache. You may want to cache under the original file ID instead?
This is to match the ClangTidyDiagnosticConsumer::checkFilters.
Do you think some other get*Location method should be used in both places?



================
Comment at: clang-tools-extra/clang-tidy/ClangTidy.cpp:434
+    return false;
+  auto FID = Sources.getDecomposedExpansionLoc(Location).first;
+  // Quick check against last checked results.
----------------
chh wrote:
> sammccall wrote:
> > sammccall wrote:
> > > maybe add a comment explaning why the expansion loc is chosen?
> > note that this means we're going to traverse up the macro expansion edges even if we're hitting the cache. You may want to cache under the original file ID instead?
> This is to match the ClangTidyDiagnosticConsumer::checkFilters.
> Do you think some other get*Location method should be used in both places?
> 
I would rather explore various cache performance improvements in follow up CLs.
In my performance tests, current simple check of LastAcceptedFileID and LastSkippedFileID already has very high hit rate due to sequential check of Decls in the included header files. More caching improvements should produce relatively smaller difference, which will be shown/visible easier in their own small change CLs.



================
Comment at: clang-tools-extra/clang-tidy/ClangTidy.cpp:436
+  // Quick check against last checked results.
+  if (FID == LastSkippedFileID)
+    return true;
----------------
sammccall wrote:
> caching all results in a DenseMap<FileID, bool> seems likely to perform better, no?
I am sure most calls hit the last FileID. For the missed calls, I was planning to add other caching mechanism in skipFileID function. That could be in follow up CLs with more accurate measurements to compare the overhead vs hit rate.



================
Comment at: clang/include/clang/ASTMatchers/ASTMatchFinder.h:148
+    /// Check if a Decl should be skipped.
+    std::shared_ptr<DeclFilter> Filter;
   };
----------------
sammccall wrote:
> hokein wrote:
> > chh wrote:
> > > sammccall wrote:
> > > > I don't think the filter belongs here.
> > > > The design of traversal scope is that it's a property of the AST that affects all traversals, so it shouldn't be a configuration property of particular traversals like ASTMatchFinder.
> > > > 
> > > > There's a question of what to do about `MatchFinder::newASTConsumer()`, which runs the finder immediately after an AST is created, and so covers the point at which we might set a scope. There are several good options, e.g.:
> > > >  - Don't provide any facility for setting traversal scope when newASTConsumer() is used - it's not commonly needed, and the ASTConsumer is trivial to reimplement where that's actually needed
> > > >  - Accept an optional `function<void(ASTContext&)>` which should run just before matching starts
> > > > 
> > > > This seems a bit subtle, but the difference between having this in MatchFinder proper vs just in newASTConsumer() is important I think, precisely because it's common to run the matchers directly on an existing AST.
> > > > 
> > > I have some similar concerns too.
> > > 
> > > clangd calls MatchFinder::matchAST explicitly after setTraversalScope,
> > > but clang-tidy uses MultiplexConsumer and MatchFinder::newASTConsumer
> > > is just one of the two consumers.
> > > (1) I am not sure if it would be safe or even work to make big changes in
> > > ClangTidy.cpp's CreateASTConsumer to call MatchFinder::matchAST explicitly.
> > > (2) Similarly, I wonder if setTraversalScope will work for both MatchFinder
> > > and other consumers in the same MultiplexConsumer.
> > > 
> > > Could you check if my current change to MatchFinder::HandleTranslationUnit
> > > is right, especially in the saving/restoring of traversal scope?
> > > 
> > > I am assuming that each consumer in a MultiplexConsumer will have its own
> > > chance to HandleTranslationUnit and HandleTopLevelDecl.
> > > If that's correct, it seems to me that changing those two handlers in
> > > MatchFinder is right for clang-tidy. Then they will need the optional
> > > MatchFinder::MatchFinderOptions::DeclFilter.
> > > 
> > > 
> > > but clang-tidy uses MultiplexConsumer and MatchFinder::newASTConsumer is just one of the two consumers.
> > 
> > Yeah, `MultiplexConsumer` is a separate interface in clang, and I don't think we have a strong reason to modify it. 
> > 
> > However, we could refine the bit in clang-tidy -- clang-tidy uses `MultiplexConsumer` to dispatch all events to two consumers: the MatcherFinder, the clang's static analyzer, we can get rid of the `MultiplexConsumer` by implementing a new ASTConsumer, so that we have enough flexibility to affect all traversals without touching all clang areas, so the API will look like
> > 
> > ```
> > class ClangTidyASTConsumer : public ASTConsumer {
> > 
> > public:
> >   void HandleTopLevelDecl(...) override {
> >      // collect all main file decl
> >   }
> >   void HandleTranslationUnit(ASTContext &Context) override {
> >     // set traversal scope.
> >     MatcherFinderConsumer->HandleTranslationUnit(Context);
> >     StaticAnalyzer->HandleTranslationUnit(Context);
> >   }
> >   
> >   // implement other necessary Handle* overrides, and dispatch to StaticAnalyzer consumers;
> > 
> > private:
> >   std::unique_ptr<ASTConsumer> MatcherFinderConsumer;
> >   std::unique_ptr<...> StaticAnalyzer;
> >   ... MainFileDecl;
> > }; 
> > ``` 
> MultiplexConsumer is just used as a helper to glue together two concrete ASTConsumers, not arbitrary consumers, so we can reason about whether it's safe by reading the code.
> 
> The static analyzer's consumer appears to already track top-level decls rather than traversing from the root, exactly to avoid traversing stuff from the preamble. So I would expect setTraversalContext to work fine there.
> 
> The approach Haojian suggests of avoiding MultiplexConsumer, and making the tight coupling to the particular consumers explicit, also seems fine.
> 
> > If that's correct, it seems to me that changing those two handlers in MatchFinder is right for clang-tidy
> 
> This might be the most convenient thing for clang-tidy, but I don't think it's the right thing for MatchFinder, which is also important.
I looked into Hokein's suggestion. Although in general correctly used inheritance is better, there are special cases that copying functionality from another class could be better. I wondered if it could work better by changing ClangTidyASTConsumer to behave like  MultiplexConsumer without inheritance.

After looking into MultiplexConsumer's many methods, what are used and could be used in the future by MatchFinder's MatchASTConsumer and static analyzer's AnalysisConsumer, the dependency on MultiplexConsumer to pass through all Consumer methods is obvious. So, ClangTidyASTConsumer really needs to behave like MultiplexConsumer, duplicating all existing and future methods, to keep MatchFinder and AnalysisConsumer working. This looks like a real inheritance use case and difficult to make it better without inheritance.

If MatchFinder users really don't want an optional filter for HandleTranslationUnit and HandleTopLevelDecl, I think an alternative is to define a child class maybe called TidyMatchFinder with the extra features needed now only by clang-tidy. But, there is also advantage in adding these into MatchFinder now for future non-clang-tidy users. That will be similar to the current optional MatchFinder's CheckProfiling and the set/getTraversalScope. Those features/interfaces started with one user/tool and now we see more users/tools. If they were defined in a child class specific to clangd or some profiler, we will now need to rename or move those features to a common base class to share.

Please feel free to suggest any alternative, or let me know if you are okay with current optional filter in MatchFinder, or if you insist on defining a new child class like TidyMatchFinder.
Thanks.



CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D98710/new/

https://reviews.llvm.org/D98710



More information about the cfe-commits mailing list