[cfe-commits] [PATCH] Structured comment parsing, retaining comments in AST
Douglas Gregor
dgregor at apple.com
Mon Jun 11 08:39:34 PDT 2012
On Jun 8, 2012, at 1:46 PM, Dmitri Gribenko <gribozavr at gmail.com> wrote:
> Hello,
>
> I'm working on getting clang to parse Doxygen comments and expose them in
> the AST, via libclang APIs etc.
>
> The first step is to save comments during parsing. Most of this work
> was already done by Doug Gregor, [1] but it was reverted because it
> was not used. [2] I modified this patch so that it applies to ToT.
>
> The patch exposes raw comment text via libclang so that the feature
> can be tested with c-index-test.
A few more minor comments on the patch…
+// Declarations without Doxygen comments should not pick up some Doxygen comments.
+// RUN: grep FunctionDecl=notdoxy %t/out.c-index | grep 'Comment=' | count 0
+
+// Non-Doxygen comments should not be attached to anything.
+// RUN: grep 'NOT_DOXYGEN' %t/out.c-index | count 0
+
+// Some Doxygen comments are not attached to anything.
+// RUN: grep 'IS_DOXYGEN_NOT_ATTACHED' %t/out.c-index | count 0
+
+// Single Doxygen comments should be attached to a FunctionDecl.
+// RUN: grep FunctionDecl=isdoxy %t/out.c-index | grep 'IS_DOXYGEN_SINGLE' | count 10
+
Please use FileCheck rather than grep.
+  // The iterator range [FirstComment, Comment] contains all of the
+  // BCPL comments that, together, are associated with this declaration.
+  // Form a single comment block string for this declaration that concatenates
+  // all of these comments.
+  std::string &Result = DeclComments[D];
+  while (FirstComment != Comment) {
+    std::pair<FileID, unsigned> DecompStart
+      = SourceMgr.getDecomposedLoc(FirstComment->getBegin());
+    std::pair<FileID, unsigned> DecompEnd
+      = SourceMgr.getDecomposedLoc(Comment->getEnd());
+    Result.append(FileBufferStart + DecompStart.second,
+                  FileBufferStart + DecompEnd.second + 1);
+    ++FirstComment;
+  }
+
+  // Append the last comment line.
+  Result.append(FileBufferStart +
+                SourceMgr.getFileOffset(Comment->getBegin()),
+                FileBufferStart + CommentEndDecomp.second + 1);
Might SmallString<256> be a better choice here than std::string, since many comments are likely to be short?
+  // Find the comment that occurs just after this declaration.
+  std::vector<SourceRange>::iterator Comment
+    = std::lower_bound(CommentSourceRanges.begin(),
+                       CommentSourceRanges.end(),
+                       SourceRange(DeclLoc),
+                       BeforeInTranslationUnit(&SourceMgr));
// ...
+  if (Invalid)
+    return NULL;
// ...
+  std::string &Result = DeclComments[D];
Should we insert an empty string into DeclComments early on, before we do the work of performing lower_bound, so that repeated queries for the comment string of a declaration that does *not* have a comment don't keep performing lower_bound calls? In other words, should we cache the negative case as well as the positive case?
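To make the question concrete, here is a minimal, self-contained sketch of caching the negative case. It does not use the Clang types: CommentCache, getCommentFor, NumSearches, and the line-number keys are hypothetical stand-ins for CommentSourceRanges, the lookup in the patch, and SourceLocations. The attachment rule (a comment attaches only when it sits on the line directly above the declaration) is also an assumption made purely for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the state in the patch. Lines are plain
// unsigned numbers here rather than SourceLocations.
struct CommentCache {
  typedef std::pair<unsigned, std::string> CommentEntry;

  // (line, text) pairs sorted by line; plays the role of CommentSourceRanges.
  std::vector<CommentEntry> Comments;
  // Plays the role of DeclComments. An empty string records "no comment";
  // real comment text is never empty because it includes the comment marker.
  std::map<unsigned, std::string> DeclComments;
  // Counts how many binary searches were actually performed.
  unsigned NumSearches = 0;

  static bool startsBefore(const CommentEntry &C, unsigned Line) {
    return C.first < Line;
  }

  const std::string &getCommentFor(unsigned DeclLine) {
    // Insert an empty entry before doing any work. If an entry already
    // exists (positive or negative), return it without searching again.
    std::pair<std::map<unsigned, std::string>::iterator, bool> Inserted =
        DeclComments.insert(std::make_pair(DeclLine, std::string()));
    std::string &Result = Inserted.first->second;
    if (!Inserted.second)
      return Result;

    ++NumSearches;
    std::vector<CommentEntry>::iterator It = std::lower_bound(
        Comments.begin(), Comments.end(), DeclLine, startsBefore);

    // Assumed rule for this sketch only: attach the closest preceding
    // comment if it is on the line immediately above the declaration.
    if (It != Comments.begin() && std::prev(It)->first + 1 == DeclLine)
      Result = std::prev(It)->second;
    return Result; // stays empty for the negative case
  }
};

// Usage: one declaration with a comment, one without that is queried twice.
unsigned exampleSearchCount() {
  CommentCache Cache;
  Cache.Comments.emplace_back(1u, "/// IS_DOXYGEN_SINGLE");
  Cache.Comments.emplace_back(10u, "/// unrelated");
  assert(Cache.getCommentFor(2) == "/// IS_DOXYGEN_SINGLE");
  assert(Cache.getCommentFor(5).empty()); // first miss: performs a search
  assert(Cache.getCommentFor(5).empty()); // second miss: served from cache
  return Cache.NumSearches;
}
```

In this sketch the second query for the uncommented declaration never reaches lower_bound, so exampleSearchCount() performs two searches for three queries. The cost is one empty map entry per uncommented declaration that is ever queried.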
- Doug