[cfe-commits] [PATCH] Introduce diagnostics into comment parsing and semantic analysis

Douglas Gregor dgregor at apple.com
Tue Jul 10 13:43:22 PDT 2012

On Jul 9, 2012, at 4:52 PM, Dmitri Gribenko <gribozavr at gmail.com> wrote:

> Hello,
> The attached patch enables comment parsing and semantic analysis to
> emit diagnostics.  A few diagnostics implemented -- see testcases.
> I created a new TableGen file for comment diagnostics,
> DiagnosticCommentKinds.td, because comment diagnostics don't logically
> fit into AST diagnostics file.  But I don't feel strongly about it.
> This also implements support for self-closing HTML tags in comment
> lexer and parser (for example, <br />).
> In order to issue precise diagnostic comment sema needs to know the
> declaration.  There is no easy way to find a decl by comment, so we
> match comments and decls in lockstep: after parsing one declgroup we
> check if we have any new, not yet attached comments.  If we do -- then
> we do the usual comment-finding process.
> It is interesting that this automatically handles trailing comments.
> We pick up not only comments that precede the declaration, but also
> comments that *follow* the declaration -- thanks to the lookahead in
> the lexer: after parsing the declgroup we've consumed the semicolon
> and looked ahead through comments.

This came out very cleanly, nice!

A couple minor comments below, but I think this patch is ready to go in.

Index: include/clang/Basic/DiagnosticGroups.td
--- include/clang/Basic/DiagnosticGroups.td	(revision 159963)
+++ include/clang/Basic/DiagnosticGroups.td	(working copy)
@@ -58,6 +58,7 @@
 def : DiagGroup<"discard-qual">;
 def : DiagGroup<"div-by-zero">;
 def Doxygen : DiagGroup<"doxygen">;
+def DoxygenPedantic : DiagGroup<"doxygen-pedantic">;
 def EmptyBody : DiagGroup<"empty-body">;
 def ExtraTokens : DiagGroup<"extra-tokens">;
Should "doxygen" and "doxygen-pedantic" be "documentation" and "documentation-pedantic"? Some of these warnings (e.g., parameter not found) apply equally to non-Doxygen documentation systems.

@@ -558,6 +581,14 @@
   static bool classof(const ParamCommandComment *) { return true; }
+  enum PassDirection {
+    In,
+    Out,
+    InOut
+  };
+  static const char *directionToString(PassDirection D);

getDirectionAsString(), perhaps? Function names should generally start with verbs.

+  const FunctionDecl *FD = dyn_cast_or_null<FunctionDecl>(ThisDecl);
+  // We already warned that this \\param is not attached to a function decl.
+  if (!FD)
+    return Command;

Can we make this work for ObjCMethodDecls, too? This can go into a follow-up patch.

As a stretch goal, one could imagine that \param might make sense for typedefs of function types, function pointer types, or block pointer types (although that's not nearly as important as ObjCMethodDecls).

+    if (OpenLineInvalid || CloseLineInvalid || OpenLine == CloseLine)
+      Diag(HOT->getLocation(), diag::warn_doc_html_open_close_mismatch)
+        << HOT->getTagName() << HCT->getTagName()
+        << HOT->getSourceRange() << HCT->getSourceRange();
+    else {
+      Diag(HOT->getLocation(), diag::warn_doc_html_open_close_mismatch)
+        << HOT->getTagName() << HCT->getTagName()
+        << HOT->getSourceRange();
+      Diag(HCT->getLocation(), diag::note_doc_html_closing_tag)
+        << HCT->getSourceRange();
+    }

And here I thought we weren't parsing HTML ;)  (I do agree that it makes sense to diagnose these, since we're here anyway. The client can handle recovery and rendering).

+  // Don't parse the comment if Doxygen diagnostics are ignored.
+  if (NumDecls != 0 && Group[0] &&
+      Diags.getDiagnosticLevel(diag::warn_doc_param_not_found,
+                               Group[0]->getLocation())
+        != DiagnosticsEngine::Ignored) {
+    // See if there are any new comments that are not attached to a decl.
+    ArrayRef<RawComment *> Comments = Context.getRawCommentList().getComments();
+    if (!Comments.empty() &&
+        !Comments.back()->isAttached()) {
+      // There is at least one comment that not attached to a decl.
+      // Maybe it should be attached to one of these decls?
+      //
+      // Note that this way we pick up not only comments that precede the
+      // declaration, but also comments that *follow* the declaration -- thanks to
+      // the lookahead in the lexer: we've consumed the semicolon and looked
+      // ahead through comments.
+      for (unsigned i = 0; i != NumDecls; ++i)
+        Context.getCommentForDecl(Group[i]);
+    }
+  }

Very nice! 

As a performance optimization, should we keep track of the last attached comment, so that we can limit the search that getCommentForDecl does to the range [last attached + 1, end()]? It seems unfortunate that the lower_bound in getRawCommentForDeclNoCache() always starts at the beginning of the list of raw comments, when we could easily tell it to search a restricted range.

Aside: should getCommentForDecl() learn that the documentation for ParmVarDecls is actually attached to the part FunctionDecl (or ObjCMethodDecl), and get the documentation from there?

	- Doug

More information about the cfe-commits mailing list