[PATCH] D66572: [analyzer] NFC: BugReporter Separation Ep.I.

Dmitri Gribenko via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Mon Sep 2 04:56:55 PDT 2019


gribozavr added a comment.

Thank you for the conversation so far! This is not a complete review from me, but I'm trying to avoid branching in too many directions at once.



================
Comment at: clang/include/clang/StaticAnalyzer/Core/BugReporter/BugReporter.h:122
+  /// Get the location on which the report should be uniqued.
+  virtual PathDiagnosticLocation getUniqueingLocation() const {
+    return Location;
----------------
NoQ wrote:
> gribozavr wrote:
> > Where can I read about the uniqueing logic? Does it mean that out of two bug reports with the same location only one gets displayed, regardless of other properties?
> Added some comments ^^
Thanks for the comments, but sorry, this one is still unclear.

(1)

```
// Imagine a ClangTidy checker that finds calls to math functions where it is 

double foo() {
  return sqrt(-1); // warning: 'sqrt' will always return NaN
}
```

What should be the "decl with issue" here? 'foo' or 'sqrt'?

(2)

```
#include <vector>
using std::vector; // warning: using names from 'std' is error-prone
```

(3)

```
void foo(Foo f) {
  bar(std::move(f)); // warning: passing result of 'std::move' as a const reference argument, no move will actually happen
}
```

Is it 'foo', 'bar', 'std::move', or the copy constructor of 'Foo'?


================
Comment at: clang/include/clang/StaticAnalyzer/Core/BugReporter/BugReporter.h:126-135
+  /// Get the declaration containing the uniqueing location.
+  virtual const Decl *getUniqueingDecl() const {
+    return DeclWithIssue;
+  }
+
+  /// Return the canonical declaration, be it a method or class, where
+  /// this issue semantically occurred.
----------------
NoQ wrote:
> gribozavr wrote:
> > I don't think `getUniqueingDecl()` and `getDeclWithIssue()` make sense for most ClangTidy checkers.
> `getDeclWithIssue()` corresponds to the "Function/Method" column we have in our scan-build index page:
> 
> {F9881207}
> 
> This column is not super important but probably kinda nice to have. I don't mind leaving it empty for clang-tidy checks, or possibly auto-detect it post factum when possible (e.g., find the smallest decl in which the warning location is contained).
> 
> `getUniqueingDecl()` is, as far as i understand, only currently used by the issue hash mechanism which governs deduplication across translation units (e.g., we emit a warning against the same header in multiple translation units - it's vital for the static analyzer because it's not uncommon for an execution path that starts in the main file to end in a header). And even then, it's only present in plist output. I'm not convinced it's useful at all. It's likely that we can remove it.
> 
> Also clang-tidy checks wouldn't need to specify their `UniqueingDecl` manually as it silently defaults to their `DeclWithIssue`.
> 
> So basically we have these two methods on the base class for bureaucratic reasons: our existing algorithms handle all kinds of warnings uniformly based on these computed properties of bug reports. I think we should be perfectly fine if some kinds of bug reports return null from them, indicating that "this property doesn't make sense for them". We could instead reinvent protocols on top of our custom RTTI ("does this warning conform to `Uniqueable`? if so, i'll try to unique it"), but i don't think it's worth the complexity.
> `getDeclWithIssue()` corresponds to the "Function/Method" column we have in our scan-build index page:

Judging from the screenshot, that column seems to play a function of a human-readable summary of the warning location. Is that correct?

Should we determine it automatically from `getLocation()`? Is there a use case for `getDeclWithIssue()` be different from the function/class that `getLocation()` is pointing into?

> (e.g., find the smallest decl in which the warning location is contained).

Oh you are saying the same thing. Why not have it always be auto-detected? Is there a use case for manual curation?

> `getUniqueingDecl()` [...] I'm not convinced it's useful at all. It's likely that we can remove it.

I agree about eliminating it -- actually, replacing it with comparing source locations. In our experience running ClangTidy, uniquing bug reports based on source locations works quite well. Can you give it a go in a separate patch that we can land ahead of time? I see there are only 4 callers of the BugReport constructor that accepts the location.

Also the doc comment is somewhat contradictory. "declarations that ... contains ... the uniqueing location. This is not actively used for uniqueing..."


================
Comment at: clang/include/clang/StaticAnalyzer/Core/BugReporter/BugReporter.h:186
+  /// ranges.
+  void addRange(SourceRange R) {
+    assert((R.isValid() || Ranges.empty()) && "Invalid range can only be used "
----------------
NoQ wrote:
> gribozavr wrote:
> > Ranges should be associated with a message.
> Mmm, what do you mean?
> 
> Currently these ranges are attached to the warning message, path note messages can't have ranges, and extra path-insensitive notes can have a separate set of ranges attached to them by passing them through `addNote()`.
I see. What looks weird to me is that methods related to the warning itself are on `BugReport`, but notes and fixits are their own data structures. It creates an inconsistent API, and makes notes and fixits feel bolted on.

Do you think it would make sense to change the API to be more uniform?


================
Comment at: clang/include/clang/StaticAnalyzer/Core/BugReporter/BugReporter.h:78
   using ranges_iterator = const SourceRange *;
-  using VisitorList = SmallVector<std::unique_ptr<BugReporterVisitor>, 8>;
-  using visitor_iterator = VisitorList::iterator;
-  using visitor_range = llvm::iterator_range<visitor_iterator>;
   using NoteList = SmallVector<std::shared_ptr<PathDiagnosticNotePiece>, 4>;
 
----------------
Make NoteList private, or eliminate completely?


================
Comment at: clang/include/clang/StaticAnalyzer/Core/BugReporter/BugReporter.h:98
+
+  BugReport(Kind kind, const BugType &bt, StringRef shortDesc, StringRef desc)
+      : K(kind), BT(bt), ShortDescription(shortDesc), Description(desc) {}
----------------
I think using identical names for constructor parameters and member variables is idiomatic in LLVM.


================
Comment at: clang/include/clang/StaticAnalyzer/Core/BugReporter/BugReporter.h:108
+
+  StringRef getDescription() const { return Description; }
+
----------------
What's the difference between description and short description?

What are the expected grammatical conventions? For example, Clang and ClangTidy warning are sentence fragments that start with a lowercase letter and don't end with a period.


================
Comment at: clang/include/clang/StaticAnalyzer/Core/BugReporter/BugReporter.h:116
+
+  /// This location is where the clients that render diagnostics will display
+  /// the warning description. The warning description is supposed to describe
----------------
The primary location of the bug report that points at the undesirable behavior in the code.

UIs should attach the warning description to this location. The warning description should describe bad behavior at this location.


================
Comment at: clang/include/clang/StaticAnalyzer/Core/BugReporter/BugReporter.h:142
+
+  /// Add new item to the list of additional notes that need to be attached to
+  /// this path-insensitive report. If you want to add extra notes to a
----------------
"Add a new note at the end of this path-insensitive report."


================
Comment at: clang/include/clang/StaticAnalyzer/Core/BugReporter/BugReporter.h:147
+  /// the extra note should appear.
+  void addNote(StringRef Msg, const PathDiagnosticLocation &Pos,
+               ArrayRef<SourceRange> Ranges = {}) {
----------------
I'm concerned that `PathDiagnosticLocation` is too complex of an abstraction for ClangTidy. It might even be too complex for Static Analyzer checkers. It is OK for the Static Analyzer core though.

Take a look at users of `addNote` -- they use helpers like `makeLocation`. I don't think that would scale to the number of checkers that we have. The API should be simpler.


================
Comment at: clang/include/clang/StaticAnalyzer/Core/BugReporter/BugReporter.h:157
+
+  const NoteList &getNotes() {
+    return Notes;
----------------
ArrayRef?


================
Comment at: clang/include/clang/StaticAnalyzer/Core/BugReporter/BugReporter.h:175
+  /// Get the SourceRanges associated with the report.
+  virtual llvm::iterator_range<ranges_iterator> getRanges() const {
+    return llvm::make_range(Ranges.begin(), Ranges.end());
----------------
s/iterator_range/ArrayRef/?


================
Comment at: clang/include/clang/StaticAnalyzer/Core/BugReporter/BugReporter.h:210
+  PathDiagnosticLocation getLocation(const SourceManager &SM) const override {
+    return getUniqueingLocation();
+  }
----------------
I think the delegation should be done in the opposite way -- `getLocation` should be the source of truth, and `getUniqueingLocation` should call `getLocation`.


================
Comment at: clang/include/clang/StaticAnalyzer/Core/BugReporter/BugReporter.h:440
   visitor_iterator visitor_end() { return Callbacks.end(); }
   visitor_range visitors() { return {visitor_begin(), visitor_end()}; }
 
----------------
Could it just be `ArrayRef<...> visitors()`, without exposing extra typedefs?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D66572/new/

https://reviews.llvm.org/D66572





More information about the cfe-commits mailing list