[cfe-commits] [PATCH] Structured comment parsing, retaining comments in AST

Douglas Gregor dgregor at apple.com
Mon Jun 11 08:39:34 PDT 2012


On Jun 8, 2012, at 1:46 PM, Dmitri Gribenko <gribozavr at gmail.com> wrote:

> Hello,
> 
> I'm working on getting clang to parse Doxygen comments and expose them in
> the AST, via libclang APIs etc.
> 
> The first step is to save comments during parsing.  Most of this work
> was already done by Doug Gregor, [1] but it was reverted because it
> was not used. [2]  I modified this patch so that it applies to ToT.
> 
> The patch exposes raw comment text via libclang so that the feature
> can be tested with c-index-test.

A few more minor comments on the patch…

+// Declarations without Doxygen comments should not pick up some Doxygen comments.
+// RUN: grep FunctionDecl=notdoxy %t/out.c-index | grep 'Comment=' | count 0
+
+// Non-Doxygen comments should not be attached to anything.
+// RUN: grep 'NOT_DOXYGEN' %t/out.c-index | count 0
+
+// Some Doxygen comments are not attached to anything.
+// RUN: grep 'IS_DOXYGEN_NOT_ATTACHED' %t/out.c-index | count 0
+
+// Single Doxygen comments should be attached to a FunctionDecl.
+// RUN: grep FunctionDecl=isdoxy %t/out.c-index | grep 'IS_DOXYGEN_SINGLE' | count 10
+

Please use FileCheck rather than grep.

+  // The iterator range [FirstComment, Comment] contains all of the
+  // BCPL comments that, together, are associated with this declaration.
+  // Form a single comment block string for this declaration that concatenates
+  // all of these comments.
+  std::string &Result = DeclComments[D];
+  while (FirstComment != Comment) {
+    std::pair<FileID, unsigned> DecompStart
+      = SourceMgr.getDecomposedLoc(FirstComment->getBegin());
+    std::pair<FileID, unsigned> DecompEnd
+      = SourceMgr.getDecomposedLoc(Comment->getEnd());
+    Result.append(FileBufferStart + DecompStart.second,
+                  FileBufferStart + DecompEnd.second + 1);
+    ++FirstComment;
+  }
+
+  // Append the last comment line.
+  Result.append(FileBufferStart +
+                  SourceMgr.getFileOffset(Comment->getBegin()),
+                FileBufferStart + CommentEndDecomp.second + 1);

Might SmallString<256> be a better choice here than std::string, since many comments are likely to be short?

+  // Find the comment that occurs just after this declaration.
+  std::vector<SourceRange>::iterator Comment
+      = std::lower_bound(CommentSourceRanges.begin(),
+                         CommentSourceRanges.end(),
+                         SourceRange(DeclLoc),
+                         BeforeInTranslationUnit(&SourceMgr));
// ...
+  if (Invalid)
+    return NULL;
// ...
+      std::string &Result = DeclComments[D];

Should we insert an empty string into DeclComments early on, before we do the work of performing lower_bound, so that repeated queries for the comment string of a declaration that does *not* have a comment don't keep performing lower_bound calls? In other words, should we cache the negative case as well as the positive case?
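
To make the negative-caching idea concrete, here is a minimal standalone sketch. It is not the code from the patch: CommentEntry, CommentLookup, getCommentFor, the use of plain byte offsets in place of SourceLocation, and the "ends at most one byte before the declaration" attachment rule are all hypothetical simplifications chosen for illustration. The point is only that the cache entry is created before the search, so a declaration without a comment is searched for once.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for a stored comment: where it ends and its text.
struct CommentEntry {
  unsigned End;      // byte offset of the last character of the comment
  std::string Text;  // raw comment text
};

// Hypothetical lookup class; declarations are identified by their start offset.
class CommentLookup {
  std::vector<CommentEntry> Comments;               // sorted by End offset
  std::unordered_map<unsigned, std::string> Cache;  // decl offset -> comment

public:
  unsigned SearchCount = 0;  // number of lower_bound calls, for illustration

  explicit CommentLookup(std::vector<CommentEntry> C)
      : Comments(std::move(C)) {}

  const std::string &getCommentFor(unsigned DeclOffset) {
    // Insert an empty string up front. If the key already existed, the
    // answer (positive or negative) is cached and no search is needed.
    std::pair<std::unordered_map<unsigned, std::string>::iterator, bool> Res =
        Cache.insert(std::make_pair(DeclOffset, std::string()));
    std::string &Result = Res.first->second;
    if (!Res.second)
      return Result;

    // First comment that ends at or after the declaration start.
    ++SearchCount;
    std::vector<CommentEntry>::iterator Pos = std::lower_bound(
        Comments.begin(), Comments.end(), DeclOffset,
        [](const CommentEntry &C, unsigned Off) { return C.End < Off; });

    // The candidate is the comment just before that position. Assumed rule:
    // it attaches only if it ends at most one byte before the declaration.
    if (Pos != Comments.begin()) {
      --Pos;
      if (DeclOffset - Pos->End <= 1)
        Result = Pos->Text;
    }
    // Otherwise Result stays empty, which records the negative case.
    return Result;
  }
};
```

With this shape, asking twice about a declaration that has no comment performs one search; the second query returns the cached empty string. One trade-off of using an empty string as the sentinel is that it cannot distinguish "no comment" from "comment with empty text", which seems acceptable here.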

	- Doug




