<div dir="ltr">This should be fixed as of r332576.</div><div class="gmail_extra"><br><div class="gmail_quote">On Wed, May 16, 2018 at 11:49 PM, Galina Kistanova via cfe-commits <span dir="ltr"><<a href="mailto:cfe-commits@lists.llvm.org" target="_blank">cfe-commits@lists.llvm.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div><div>Also, a few other builders are affected:<br><br><a href="http://lab.llvm.org:8011/builders/clang-x86_64-linux-abi-test" target="_blank">http://lab.llvm.org:8011/<wbr>builders/clang-x86_64-linux-<wbr>abi-test</a><br><a href="http://lab.llvm.org:8011/builders/clang-lld-x86_64-2stage" target="_blank">http://lab.llvm.org:8011/<wbr>builders/clang-lld-x86_64-<wbr>2stage</a><br><a href="http://lab.llvm.org:8011/builders/clang-with-lto-ubuntu" target="_blank">http://lab.llvm.org:8011/<wbr>builders/clang-with-lto-ubuntu</a><br><br><br></div>Thanks<br><br></div>Galina<br></div><div class="gmail-HOEnZb"><div class="gmail-h5"><div class="gmail_extra"><br><div class="gmail_quote">On Wed, May 16, 2018 at 12:58 PM, Galina Kistanova <span dir="ltr"><<a href="mailto:gkistanova@gmail.com" target="_blank">gkistanova@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">Hello Ilya,<br><br>This commit broke the build step for a couple of our builders:<br><br><a href="http://lab.llvm.org:8011/builders/clang-with-lto-ubuntu/builds/8541" target="_blank">http://lab.llvm.org:8011/build<wbr>ers/clang-with-lto-ubuntu/<wbr>builds/8541</a><br><a href="http://lab.llvm.org:8011/builders/clang-with-thin-lto-ubuntu" target="_blank">http://lab.llvm.org:8011/build<wbr>ers/clang-with-thin-lto-ubuntu</a><br><br>. . 
.<br>FAILED: tools/clang/unittests/AST/CMak<wbr>eFiles/ASTTests.dir/CommentTex<wbr>tTest.cpp.o <br>/usr/bin/c++ -DGTEST_HAS_RTTI=0 -DGTEST_HAS_TR1_TUPLE=0 -DGTEST_LANG_CXX11=1 -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -Itools/clang/unittests/AST -I/home/buildslave/buildslave1<wbr>a/clang-with-lto-ubuntu/llvm.<wbr>src/tools/clang/unittests/AST -I/home/buildslave/buildslave1<wbr>a/clang-with-lto-ubuntu/llvm.<wbr>src/tools/clang/include -Itools/clang/include -Iinclude -I/home/buildslave/buildslave1<wbr>a/clang-with-lto-ubuntu/llvm.<wbr>src/include -I/home/buildslave/buildslave1<wbr>a/clang-with-lto-ubuntu/llvm.<wbr>src/utils/unittest/googletest/<wbr>include -I/home/buildslave/buildslave1<wbr>a/clang-with-lto-ubuntu/llvm.<wbr>src/utils/unittest/googlemock/<wbr>include -fPIC -fvisibility-inlines-hidden -std=c++11 -Wall -W -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wno-missing-field-initializer<wbr>s -pedantic -Wno-long-long -Wno-maybe-uninitialized -Wdelete-non-virtual-dtor -Wno-comment -ffunction-sections -fdata-sections -fno-common -Woverloaded-virtual -fno-strict-aliasing -O3 -DNDEBUG -Wno-variadic-macros -fno-exceptions -fno-rtti -MD -MT tools/clang/unittests/AST/CMak<wbr>eFiles/ASTTests.dir/CommentTex<wbr>tTest.cpp.o -MF tools/clang/unittests/AST/CMak<wbr>eFiles/ASTTests.dir/CommentTex<wbr>tTest.cpp.o.d -o tools/clang/unittests/AST/CMak<wbr>eFiles/ASTTests.dir/CommentTex<wbr>tTest.cpp.o -c /home/buildslave/buildslave1a/<wbr>clang-with-lto-ubuntu/llvm.src<wbr>/tools/clang/unittests/AST/Com<wbr>mentTextTest.cpp<br>/home/buildslave/buildslave1a/<wbr>clang-with-lto-ubuntu/llvm.src<wbr>/tools/clang/unittests/AST/Com<wbr>mentTextTest.cpp:62:1: error: unterminated raw string<br> R"cpp(<br> ^<br>. . 
.<br><br>Could you please have a look?<br><br>The builder was already red, so it did not send notifications.<br><br>Thanks<span class="gmail-m_-1877726083705554406HOEnZb"><font color="#888888"><br><br>Galina<br><br><br></font></span></div><div class="gmail-m_-1877726083705554406HOEnZb"><div class="gmail-m_-1877726083705554406h5"><div class="gmail_extra"><br><div class="gmail_quote">On Wed, May 16, 2018 at 5:30 AM, Ilya Biryukov via cfe-commits <span dir="ltr"><<a href="mailto:cfe-commits@lists.llvm.org" target="_blank">cfe-commits@lists.llvm.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Author: ibiryukov<br>
Date: Wed May 16 05:30:09 2018<br>
New Revision: 332458<br>
<br>
URL: <a href="http://llvm.org/viewvc/llvm-project?rev=332458&view=rev" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-pr<wbr>oject?rev=332458&view=rev</a><br>
Log:<br>
[AST] Added a helper to extract a user-friendly text of a comment.<br>
<br>
Summary:<br>
The helper is used in clangd for documentation shown in code completion<br>
and storing the docs in the symbols. See D45999.<br>
<br>
This patch reuses the code of the Doxygen comment lexer, disabling the<br>
bits that do command and html tag parsing.<br>
The new helper works on all comments, including non-doxygen comments.<br>
However, it does not understand or transform any doxygen directives,<br>
i.e. cannot extract brief text, etc.<br>
<br>
Reviewers: sammccall, hokein, ioeric<br>
<br>
Reviewed By: ioeric<br>
<br>
Subscribers: mgorny, cfe-commits<br>
<br>
Differential Revision: <a href="https://reviews.llvm.org/D46000" rel="noreferrer" target="_blank">https://reviews.llvm.org/D4600<wbr>0</a><br>
<br>
Added:<br>
cfe/trunk/unittests/AST/Commen<wbr>tTextTest.cpp<br>
Modified:<br>
cfe/trunk/include/clang/AST/Co<wbr>mmentLexer.h<br>
cfe/trunk/include/clang/AST/Ra<wbr>wCommentList.h<br>
cfe/trunk/lib/AST/CommentLexer<wbr>.cpp<br>
cfe/trunk/lib/AST/RawCommentLi<wbr>st.cpp<br>
cfe/trunk/unittests/AST/CMakeL<wbr>ists.txt<br>
<br>
Modified: cfe/trunk/include/clang/AST/Co<wbr>mmentLexer.h<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/AST/CommentLexer.h?rev=332458&r1=332457&r2=332458&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-pr<wbr>oject/cfe/trunk/include/clang/<wbr>AST/CommentLexer.h?rev=332458&<wbr>r1=332457&r2=332458&view=diff</a><br>
==============================<wbr>==============================<wbr>==================<br>
--- cfe/trunk/include/clang/AST/Co<wbr>mmentLexer.h (original)<br>
+++ cfe/trunk/include/clang/AST/Co<wbr>mmentLexer.h Wed May 16 05:30:09 2018<br>
@@ -281,6 +281,11 @@ private:<br>
/// command, including command marker.<br>
SmallString<16> VerbatimBlockEndCommandName;<br>
<br>
+ /// If true, the commands, HTML tags, etc. will be parsed and reported as<br>
+ /// separate tokens inside the comment body. If false, the comment text will<br>
+ /// be parsed into text and newline tokens.<br>
+ bool ParseCommands;<br>
+<br>
/// Given a character reference name (e.g., "lt"), return the character that<br>
/// it stands for (e.g., "<").<br>
StringRef resolveHTMLNamedCharacterRefer<wbr>ence(StringRef Name) const;<br>
@@ -315,12 +320,11 @@ private:<br>
/// Eat string matching regexp \code \s*\* \endcode.<br>
void skipLineStartingDecorations();<br>
<br>
- /// Lex stuff inside comments. CommentEnd should be set correctly.<br>
+ /// Lex comment text, including commands if ParseCommands is set to true.<br>
void lexCommentText(Token &T);<br>
<br>
- void setupAndLexVerbatimBlock(Token &T,<br>
- const char *TextBegin,<br>
- char Marker, const CommandInfo *Info);<br>
+ void setupAndLexVerbatimBlock(Token &T, const char *TextBegin, char Marker,<br>
+ const CommandInfo *Info);<br>
<br>
void lexVerbatimBlockFirstLine(Toke<wbr>n &T);<br>
<br>
@@ -343,14 +347,13 @@ private:<br>
<br>
public:<br>
Lexer(llvm::BumpPtrAllocator &Allocator, DiagnosticsEngine &Diags,<br>
- const CommandTraits &Traits,<br>
- SourceLocation FileLoc,<br>
- const char *BufferStart, const char *BufferEnd);<br>
+ const CommandTraits &Traits, SourceLocation FileLoc,<br>
+ const char *BufferStart, const char *BufferEnd,<br>
+ bool ParseCommands = true);<br>
<br>
void lex(Token &T);<br>
<br>
- StringRef getSpelling(const Token &Tok,<br>
- const SourceManager &SourceMgr,<br>
+ StringRef getSpelling(const Token &Tok, const SourceManager &SourceMgr,<br>
bool *Invalid = nullptr) const;<br>
};<br>
<br>
<br>
Modified: cfe/trunk/include/clang/AST/Ra<wbr>wCommentList.h<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/AST/RawCommentList.h?rev=332458&r1=332457&r2=332458&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-pr<wbr>oject/cfe/trunk/include/clang/<wbr>AST/RawCommentList.h?rev=33245<wbr>8&r1=332457&r2=332458&view=<wbr>diff</a><br>
==============================<wbr>==============================<wbr>==================<br>
--- cfe/trunk/include/clang/AST/Ra<wbr>wCommentList.h (original)<br>
+++ cfe/trunk/include/clang/AST/Ra<wbr>wCommentList.h Wed May 16 05:30:09 2018<br>
@@ -111,6 +111,30 @@ public:<br>
return extractBriefText(Context);<br>
}<br>
<br>
+ /// Returns sanitized comment text, suitable for presentation in editor UIs.<br>
+ /// E.g. will transform:<br>
+ /// // This is a long multiline comment.<br>
+ /// // Parts of it might be indented.<br>
+ /// /* The comments styles might be mixed. */<br>
+ /// into<br>
+ /// "This is a long multiline comment.\n"<br>
+ /// " Parts of it might be indented.\n"<br>
+ /// "The comments styles might be mixed."<br>
+ /// Also removes leading indentation and sanitizes some common cases:<br>
+ /// /* This is a first line.<br>
+ /// * This is a second line. It is indented.<br>
+ /// * This is a third line. */<br>
+ /// and<br>
+ /// /* This is a first line.<br>
+ /// This is a second line. It is indented.<br>
+ /// This is a third line. */<br>
+ /// will both turn into:<br>
+ /// "This is a first line.\n"<br>
+ /// " This is a second line. It is indented.\n"<br>
+ /// "This is a third line."<br>
+ std::string getFormattedText(const SourceManager &SourceMgr,<br>
+ DiagnosticsEngine &Diags) const;<br>
+<br>
/// Parse the comment, assuming it is attached to decl \c D.<br>
comments::FullComment *parse(const ASTContext &Context,<br>
const Preprocessor *PP, const Decl *D) const;<br>
<br>
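A rough way to picture the transformation documented for getFormattedText() above is the following minimal sketch. It only handles "//"-style line comments and works on plain strings instead of a SourceManager; formatLineComment is a hypothetical name and is not part of Clang.<br>

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper, not part of Clang: strips "//" or "///" markers and
// the indentation established by the first line of the comment.
// Assumption: every non-blank input line is a line comment.
std::string formatLineComment(const std::string &Raw) {
  std::istringstream In(Raw);
  std::string Line, Result;
  std::size_t Indent = 0; // Leading whitespace seen on the first line.
  bool IsFirstLine = true;
  while (std::getline(In, Line)) {
    // Skip whitespace in front of the marker.
    std::size_t Pos = Line.find_first_not_of(" \t");
    if (Pos == std::string::npos)
      continue; // Blank line outside of any comment.
    // Skip the run of slashes that forms the marker.
    std::size_t TextStart = Line.find_first_not_of('/', Pos);
    if (TextStart == std::string::npos)
      TextStart = Line.size();
    std::string Text = Line.substr(TextStart);
    // Amount of leading whitespace in the comment text itself.
    std::size_t Ws = Text.find_first_not_of(" \t");
    if (Ws == std::string::npos)
      Ws = Text.size();
    if (IsFirstLine)
      Indent = Ws;
    else
      Result += '\n';
    // First line: drop all leading whitespace. Later lines: drop at most
    // as much as the first line had, so extra indentation is kept.
    Result += Text.substr(IsFirstLine ? Ws : std::min(Ws, Indent));
    IsFirstLine = false;
  }
  while (!Result.empty() && Result.back() == '\n')
    Result.pop_back();
  return Result;
}
```

The patch itself tracks source columns through the comment lexer, so it also covers block comments and mixed styles; the sketch measures indentation relative to the marker only.<br>
<br>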
Modified: cfe/trunk/lib/AST/CommentLexer<wbr>.cpp<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/AST/CommentLexer.cpp?rev=332458&r1=332457&r2=332458&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-pr<wbr>oject/cfe/trunk/lib/AST/Commen<wbr>tLexer.cpp?rev=332458&r1=33245<wbr>7&r2=332458&view=diff</a><br>
==============================<wbr>==============================<wbr>==================<br>
--- cfe/trunk/lib/AST/CommentLexer<wbr>.cpp (original)<br>
+++ cfe/trunk/lib/AST/CommentLexer<wbr>.cpp Wed May 16 05:30:09 2018<br>
@@ -294,6 +294,39 @@ void Lexer::lexCommentText(Token &T) {<br>
assert(CommentState == LCS_InsideBCPLComment ||<br>
CommentState == LCS_InsideCComment);<br>
<br>
+ // Handles lexing non-command text, i.e. text and newline.<br>
+ auto HandleNonCommandToken = [&]() -> void {<br>
+ assert(State == LS_Normal);<br>
+<br>
+ const char *TokenPtr = BufferPtr;<br>
+ assert(TokenPtr < CommentEnd);<br>
+ switch (*TokenPtr) {<br>
+ case '\n':<br>
+ case '\r':<br>
+ TokenPtr = skipNewline(TokenPtr, CommentEnd);<br>
+ formTokenWithChars(T, TokenPtr, tok::newline);<br>
+<br>
+ if (CommentState == LCS_InsideCComment)<br>
+ skipLineStartingDecorations();<br>
+ return;<br>
+<br>
+ default: {<br>
+ StringRef TokStartSymbols = ParseCommands ? "\n\r\\@&<" : "\n\r";<br>
+ size_t End = StringRef(TokenPtr, CommentEnd - TokenPtr)<br>
+ .find_first_of(TokStartSymbol<wbr>s);<br>
+ if (End != StringRef::npos)<br>
+ TokenPtr += End;<br>
+ else<br>
+ TokenPtr = CommentEnd;<br>
+ formTextToken(T, TokenPtr);<br>
+ return;<br>
+ }<br>
+ }<br>
+ };<br>
+<br>
+ if (!ParseCommands)<br>
+ return HandleNonCommandToken();<br>
+<br>
switch (State) {<br>
case LS_Normal:<br>
break;<br>
@@ -315,136 +348,116 @@ void Lexer::lexCommentText(Token &T) {<br>
}<br>
<br>
assert(State == LS_Normal);<br>
-<br>
const char *TokenPtr = BufferPtr;<br>
assert(TokenPtr < CommentEnd);<br>
- while (TokenPtr != CommentEnd) {<br>
- switch(*TokenPtr) {<br>
- case '\\':<br>
- case '@': {<br>
- // Commands that start with a backslash and commands that start with<br>
- // 'at' have equivalent semantics. But we keep information about the<br>
- // exact syntax in AST for comments.<br>
- tok::TokenKind CommandKind =<br>
- (*TokenPtr == '@') ? tok::at_command : tok::backslash_command;<br>
+ switch(*TokenPtr) {<br>
+ case '\\':<br>
+ case '@': {<br>
+ // Commands that start with a backslash and commands that start with<br>
+ // 'at' have equivalent semantics. But we keep information about the<br>
+ // exact syntax in AST for comments.<br>
+ tok::TokenKind CommandKind =<br>
+ (*TokenPtr == '@') ? tok::at_command : tok::backslash_command;<br>
+ TokenPtr++;<br>
+ if (TokenPtr == CommentEnd) {<br>
+ formTextToken(T, TokenPtr);<br>
+ return;<br>
+ }<br>
+ char C = *TokenPtr;<br>
+ switch (C) {<br>
+ default:<br>
+ break;<br>
+<br>
+ case '\\': case '@': case '&': case '$':<br>
+ case '#': case '<': case '>': case '%':<br>
+ case '\"': case '.': case ':':<br>
+ // This is one of \\ \@ \& \$ etc escape sequences.<br>
TokenPtr++;<br>
- if (TokenPtr == CommentEnd) {<br>
- formTextToken(T, TokenPtr);<br>
- return;<br>
- }<br>
- char C = *TokenPtr;<br>
- switch (C) {<br>
- default:<br>
- break;<br>
-<br>
- case '\\': case '@': case '&': case '$':<br>
- case '#': case '<': case '>': case '%':<br>
- case '\"': case '.': case ':':<br>
- // This is one of \\ \@ \& \$ etc escape sequences.<br>
+ if (C == ':' && TokenPtr != CommentEnd && *TokenPtr == ':') {<br>
+ // This is the \:: escape sequence.<br>
TokenPtr++;<br>
- if (C == ':' && TokenPtr != CommentEnd && *TokenPtr == ':') {<br>
- // This is the \:: escape sequence.<br>
- TokenPtr++;<br>
- }<br>
- StringRef UnescapedText(BufferPtr + 1, TokenPtr - (BufferPtr + 1));<br>
- formTokenWithChars(T, TokenPtr, tok::text);<br>
- T.setText(UnescapedText);<br>
- return;<br>
}<br>
+ StringRef UnescapedText(BufferPtr + 1, TokenPtr - (BufferPtr + 1));<br>
+ formTokenWithChars(T, TokenPtr, tok::text);<br>
+ T.setText(UnescapedText);<br>
+ return;<br>
+ }<br>
<br>
- // Don't make zero-length commands.<br>
- if (!isCommandNameStartCharacter(<wbr>*TokenPtr)) {<br>
- formTextToken(T, TokenPtr);<br>
- return;<br>
- }<br>
+ // Don't make zero-length commands.<br>
+ if (!isCommandNameStartCharacter(<wbr>*TokenPtr)) {<br>
+ formTextToken(T, TokenPtr);<br>
+ return;<br>
+ }<br>
<br>
- TokenPtr = skipCommandName(TokenPtr, CommentEnd);<br>
- unsigned Length = TokenPtr - (BufferPtr + 1);<br>
+ TokenPtr = skipCommandName(TokenPtr, CommentEnd);<br>
+ unsigned Length = TokenPtr - (BufferPtr + 1);<br>
<br>
- // Hardcoded support for lexing LaTeX formula commands<br>
- // \f$ \f[ \f] \f{ \f} as a single command.<br>
- if (Length == 1 && TokenPtr[-1] == 'f' && TokenPtr != CommentEnd) {<br>
- C = *TokenPtr;<br>
- if (C == '$' || C == '[' || C == ']' || C == '{' || C == '}') {<br>
- TokenPtr++;<br>
- Length++;<br>
- }<br>
+ // Hardcoded support for lexing LaTeX formula commands<br>
+ // \f$ \f[ \f] \f{ \f} as a single command.<br>
+ if (Length == 1 && TokenPtr[-1] == 'f' && TokenPtr != CommentEnd) {<br>
+ C = *TokenPtr;<br>
+ if (C == '$' || C == '[' || C == ']' || C == '{' || C == '}') {<br>
+ TokenPtr++;<br>
+ Length++;<br>
}<br>
+ }<br>
<br>
- StringRef CommandName(BufferPtr + 1, Length);<br>
+ StringRef CommandName(BufferPtr + 1, Length);<br>
<br>
- const CommandInfo *Info = Traits.getCommandInfoOrNULL(Co<wbr>mmandName);<br>
- if (!Info) {<br>
- if ((Info = Traits.getTypoCorrectCommandIn<wbr>fo(CommandName))) {<br>
- StringRef CorrectedName = Info->Name;<br>
- SourceLocation Loc = getSourceLocation(BufferPtr);<br>
- SourceLocation EndLoc = getSourceLocation(TokenPtr);<br>
- SourceRange FullRange = SourceRange(Loc, EndLoc);<br>
- SourceRange CommandRange(Loc.getLocWithOff<wbr>set(1), EndLoc);<br>
- Diag(Loc, diag::warn_correct_comment_com<wbr>mand_name)<br>
- << FullRange << CommandName << CorrectedName<br>
- << FixItHint::CreateReplacement(C<wbr>ommandRange, CorrectedName);<br>
- } else {<br>
- formTokenWithChars(T, TokenPtr, tok::unknown_command);<br>
- T.setUnknownCommandName(Comman<wbr>dName);<br>
- Diag(T.getLocation(), diag::warn_unknown_comment_com<wbr>mand_name)<br>
- << SourceRange(T.getLocation(), T.getEndLocation());<br>
- return;<br>
- }<br>
- }<br>
- if (Info->IsVerbatimBlockCommand) {<br>
- setupAndLexVerbatimBlock(T, TokenPtr, *BufferPtr, Info);<br>
+ const CommandInfo *Info = Traits.getCommandInfoOrNULL(Co<wbr>mmandName);<br>
+ if (!Info) {<br>
+ if ((Info = Traits.getTypoCorrectCommandIn<wbr>fo(CommandName))) {<br>
+ StringRef CorrectedName = Info->Name;<br>
+ SourceLocation Loc = getSourceLocation(BufferPtr);<br>
+ SourceLocation EndLoc = getSourceLocation(TokenPtr);<br>
+ SourceRange FullRange = SourceRange(Loc, EndLoc);<br>
+ SourceRange CommandRange(Loc.getLocWithOff<wbr>set(1), EndLoc);<br>
+ Diag(Loc, diag::warn_correct_comment_com<wbr>mand_name)<br>
+ << FullRange << CommandName << CorrectedName<br>
+ << FixItHint::CreateReplacement(C<wbr>ommandRange, CorrectedName);<br>
+ } else {<br>
+ formTokenWithChars(T, TokenPtr, tok::unknown_command);<br>
+ T.setUnknownCommandName(Comman<wbr>dName);<br>
+ Diag(T.getLocation(), diag::warn_unknown_comment_com<wbr>mand_name)<br>
+ << SourceRange(T.getLocation(), T.getEndLocation());<br>
return;<br>
}<br>
- if (Info->IsVerbatimLineCommand) {<br>
- setupAndLexVerbatimLine(T, TokenPtr, Info);<br>
- return;<br>
- }<br>
- formTokenWithChars(T, TokenPtr, CommandKind);<br>
- T.setCommandID(Info->getID());<br>
- return;<br>
}<br>
-<br>
- case '&':<br>
- lexHTMLCharacterReference(T);<br>
+ if (Info->IsVerbatimBlockCommand) {<br>
+ setupAndLexVerbatimBlock(T, TokenPtr, *BufferPtr, Info);<br>
return;<br>
-<br>
- case '<': {<br>
- TokenPtr++;<br>
- if (TokenPtr == CommentEnd) {<br>
- formTextToken(T, TokenPtr);<br>
- return;<br>
- }<br>
- const char C = *TokenPtr;<br>
- if (isHTMLIdentifierStartingChara<wbr>cter(C))<br>
- setupAndLexHTMLStartTag(T);<br>
- else if (C == '/')<br>
- setupAndLexHTMLEndTag(T);<br>
- else<br>
- formTextToken(T, TokenPtr);<br>
+ }<br>
+ if (Info->IsVerbatimLineCommand) {<br>
+ setupAndLexVerbatimLine(T, TokenPtr, Info);<br>
return;<br>
}<br>
+ formTokenWithChars(T, TokenPtr, CommandKind);<br>
+ T.setCommandID(Info->getID());<br>
+ return;<br>
+ }<br>
<br>
- case '\n':<br>
- case '\r':<br>
- TokenPtr = skipNewline(TokenPtr, CommentEnd);<br>
- formTokenWithChars(T, TokenPtr, tok::newline);<br>
-<br>
- if (CommentState == LCS_InsideCComment)<br>
- skipLineStartingDecorations();<br>
- return;<br>
+ case '&':<br>
+ lexHTMLCharacterReference(T);<br>
+ return;<br>
<br>
- default: {<br>
- size_t End = StringRef(TokenPtr, CommentEnd - TokenPtr).<br>
- find_first_of("\n\r\\@&<");<br>
- if (End != StringRef::npos)<br>
- TokenPtr += End;<br>
- else<br>
- TokenPtr = CommentEnd;<br>
+ case '<': {<br>
+ TokenPtr++;<br>
+ if (TokenPtr == CommentEnd) {<br>
formTextToken(T, TokenPtr);<br>
return;<br>
}<br>
+ const char C = *TokenPtr;<br>
+ if (isHTMLIdentifierStartingChara<wbr>cter(C))<br>
+ setupAndLexHTMLStartTag(T);<br>
+ else if (C == '/')<br>
+ setupAndLexHTMLEndTag(T);<br>
+ else<br>
+ formTextToken(T, TokenPtr);<br>
+ return;<br>
}<br>
+<br>
+ default:<br>
+ return HandleNonCommandToken();<br>
}<br>
}<br>
<br>
@@ -727,14 +740,13 @@ void Lexer::lexHTMLEndTag(Token &T) {<br>
}<br>
<br>
Lexer::Lexer(llvm::BumpPtrAll<wbr>ocator &Allocator, DiagnosticsEngine &Diags,<br>
- const CommandTraits &Traits,<br>
- SourceLocation FileLoc,<br>
- const char *BufferStart, const char *BufferEnd):<br>
- Allocator(Allocator), Diags(Diags), Traits(Traits),<br>
- BufferStart(BufferStart), BufferEnd(BufferEnd),<br>
- FileLoc(FileLoc), BufferPtr(BufferStart),<br>
- CommentState(LCS_BeforeComment<wbr>), State(LS_Normal) {<br>
-}<br>
+ const CommandTraits &Traits, SourceLocation FileLoc,<br>
+ const char *BufferStart, const char *BufferEnd,<br>
+ bool ParseCommands)<br>
+ : Allocator(Allocator), Diags(Diags), Traits(Traits),<br>
+ BufferStart(BufferStart), BufferEnd(BufferEnd), FileLoc(FileLoc),<br>
+ BufferPtr(BufferStart), CommentState(LCS_BeforeComment<wbr>), State(LS_Normal),<br>
+ ParseCommands(ParseCommands) {}<br>
<br>
void Lexer::lex(Token &T) {<br>
again:<br>
<br>
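The effect of ParseCommands on where a text token ends can be illustrated in isolation. This is a minimal sketch of the default branch of HandleNonCommandToken using std::string instead of StringRef; firstTextToken is a hypothetical name, and the sketch assumes the input does not start with one of the token-start symbols.<br>

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for the default branch of HandleNonCommandToken:
// returns the text token that starts at the beginning of Text.
std::string firstTextToken(const std::string &Text, bool ParseCommands) {
  // With command parsing enabled, '\\', '@', '&' and '<' also end a text
  // token. Without it, only newline characters do.
  const char *TokStartSymbols = ParseCommands ? "\n\r\\@&<" : "\n\r";
  std::size_t End = Text.find_first_of(TokStartSymbols);
  return End == std::string::npos ? Text : Text.substr(0, End);
}
```

For the input "see @param x\nnext", the token is "see " when commands are parsed and "see @param x" when they are not, which is why Doxygen directives survive verbatim in the formatted text.<br>
<br>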
Modified: cfe/trunk/lib/AST/RawCommentLi<wbr>st.cpp<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/AST/RawCommentList.cpp?rev=332458&r1=332457&r2=332458&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-pr<wbr>oject/cfe/trunk/lib/AST/RawCom<wbr>mentList.cpp?rev=332458&r1=332<wbr>457&r2=332458&view=diff</a><br>
==============================<wbr>==============================<wbr>==================<br>
--- cfe/trunk/lib/AST/RawCommentLi<wbr>st.cpp (original)<br>
+++ cfe/trunk/lib/AST/RawCommentLi<wbr>st.cpp Wed May 16 05:30:09 2018<br>
@@ -335,3 +335,94 @@ void RawCommentList::addDeserialize<wbr>dComm<br>
BeforeThanCompare<RawComment>(<wbr>SourceMgr));<br>
std::swap(Comments, MergedComments);<br>
}<br>
+<br>
+std::string RawComment::getFormattedText(c<wbr>onst SourceManager &SourceMgr,<br>
+ DiagnosticsEngine &Diags) const {<br>
+ llvm::StringRef CommentText = getRawText(SourceMgr);<br>
+ if (CommentText.empty())<br>
+ return "";<br>
+<br>
+ llvm::BumpPtrAllocator Allocator;<br>
+ // We do not parse any commands, so CommentOptions are ignored by<br>
+ // comments::Lexer. Therefore, we just use default-constructed options.<br>
+ CommentOptions DefOpts;<br>
+ comments::CommandTraits EmptyTraits(Allocator, DefOpts);<br>
+ comments::Lexer L(Allocator, Diags, EmptyTraits, getSourceRange().getBegin(),<br>
+ CommentText.begin(), CommentText.end(),<br>
+ /*ParseCommands=*/false);<br>
+<br>
+ std::string Result;<br>
+ // A column number of the first non-whitespace token in the comment text.<br>
+ // We skip whitespace up to this column, but keep the whitespace after this<br>
+ // column. IndentColumn is calculated when lexing the first line and reused<br>
+ // for the rest of lines.<br>
+ unsigned IndentColumn = 0;<br>
+<br>
+ // Processes one line of the comment and adds it to the result.<br>
+ // Handles skipping the indent at the start of the line.<br>
+ // Returns false when eof is reached and true otherwise.<br>
+ auto LexLine = [&](bool IsFirstLine) -> bool {<br>
+ comments::Token Tok;<br>
+ // Lex the first token on the line. We handle it separately, because we need<br>
+ // to fix up its indentation.<br>
+ L.lex(Tok);<br>
+ if (Tok.is(comments::tok::eof))<br>
+ return false;<br>
+ if (Tok.is(comments::tok::newline<wbr>)) {<br>
+ Result += "\n";<br>
+ return true;<br>
+ }<br>
+ llvm::StringRef TokText = L.getSpelling(Tok, SourceMgr);<br>
+ bool LocInvalid = false;<br>
+ unsigned TokColumn =<br>
+ SourceMgr.getSpellingColumnNum<wbr>ber(Tok.getLocation(), &LocInvalid);<br>
+ assert(!LocInvalid && "getFormattedText for invalid location");<br>
+<br>
+ // Amount of leading whitespace in TokText.<br>
+ size_t WhitespaceLen = TokText.find_first_not_of(" \t");<br>
+ if (WhitespaceLen == StringRef::npos)<br>
+ WhitespaceLen = TokText.size();<br>
+ // Remember the amount of whitespace we skipped in the first line to remove<br>
+ // indent up to that column in the following lines.<br>
+ if (IsFirstLine)<br>
+ IndentColumn = TokColumn + WhitespaceLen;<br>
+<br>
+ // Amount of leading whitespace we actually want to skip.<br>
+ // For the first line we skip all the whitespace.<br>
+ // For the rest of the lines, we skip whitespace up to IndentColumn.<br>
+ unsigned SkipLen =<br>
+ IsFirstLine<br>
+ ? WhitespaceLen<br>
+ : std::min<size_t>(<br>
+ WhitespaceLen,<br>
+ std::max<int>(static_cast<int><wbr>(IndentColumn) - TokColumn, 0));<br>
+ llvm::StringRef Trimmed = TokText.drop_front(SkipLen);<br>
+ Result += Trimmed;<br>
+ // Lex all tokens in the rest of the line.<br>
+ for (L.lex(Tok); Tok.isNot(comments::tok::eof); L.lex(Tok)) {<br>
+ if (Tok.is(comments::tok::newline<wbr>)) {<br>
+ Result += "\n";<br>
+ return true;<br>
+ }<br>
+ Result += L.getSpelling(Tok, SourceMgr);<br>
+ }<br>
+ // We've reached the end of file token.<br>
+ return false;<br>
+ };<br>
+<br>
+ auto DropTrailingNewLines = [](std::string &Str) {<br>
+ while (!Str.empty() && Str.back() == '\n')<br>
+ Str.pop_back();<br>
+ };<br>
+<br>
+ // Process first line separately to remember indent for the following lines.<br>
+ if (!LexLine(/*IsFirstLine=*/true<wbr>)) {<br>
+ DropTrailingNewLines(Result);<br>
+ return Result;<br>
+ }<br>
+ // Process the rest of the lines.<br>
+ while (LexLine(/*IsFirstLine=*/false<wbr>))<br>
+ ;<br>
+ DropTrailingNewLines(Result);<br>
+ return Result;<br>
+}<br>
<br>
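The SkipLen expression in LexLine is easier to follow with concrete numbers. The sketch below restates it with plain integers; skipLen is a hypothetical free function, and the column values in the usage note are illustrative assumptions rather than values taken from the patch.<br>

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical restatement of the SkipLen computation from LexLine.
// TokColumn:     column of the first token on the current line.
// WhitespaceLen: leading whitespace inside that token.
// IndentColumn:  TokColumn + WhitespaceLen as recorded on the first line.
std::size_t skipLen(bool IsFirstLine, unsigned TokColumn,
                    std::size_t WhitespaceLen, unsigned IndentColumn) {
  // The first line drops all of its leading whitespace.
  if (IsFirstLine)
    return WhitespaceLen;
  // Later lines drop whitespace only up to the recorded indent column.
  int UpToIndent = std::max(
      static_cast<int>(IndentColumn) - static_cast<int>(TokColumn), 0);
  return std::min(WhitespaceLen, static_cast<std::size_t>(UpToIndent));
}
```

Assuming a first line whose text token starts at column 3 with one leading space, IndentColumn is 4. A following line whose token also starts at column 3 with three leading spaces then skips one space and keeps two as visible indentation.<br>
<br>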
Modified: cfe/trunk/unittests/AST/CMakeL<wbr>ists.txt<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/cfe/trunk/unittests/AST/CMakeLists.txt?rev=332458&r1=332457&r2=332458&view=diff" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-pr<wbr>oject/cfe/trunk/unittests/AST/<wbr>CMakeLists.txt?rev=332458&r1=3<wbr>32457&r2=332458&view=diff</a><br>
==============================<wbr>==============================<wbr>==================<br>
--- cfe/trunk/unittests/AST/CMakeL<wbr>ists.txt (original)<br>
+++ cfe/trunk/unittests/AST/CMakeL<wbr>ists.txt Wed May 16 05:30:09 2018<br>
@@ -9,6 +9,7 @@ add_clang_unittest(ASTTests<br>
ASTVectorTest.cpp<br>
CommentLexer.cpp<br>
CommentParser.cpp<br>
+ CommentTextTest.cpp<br>
DataCollectionTest.cpp<br>
DeclPrinterTest.cpp<br>
DeclTest.cpp<br>
<br>
Added: cfe/trunk/unittests/AST/Commen<wbr>tTextTest.cpp<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/cfe/trunk/unittests/AST/CommentTextTest.cpp?rev=332458&view=auto" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-pr<wbr>oject/cfe/trunk/unittests/AST/<wbr>CommentTextTest.cpp?rev=332458<wbr>&view=auto</a><br>
==============================<wbr>==============================<wbr>==================<br>
--- cfe/trunk/unittests/AST/Commen<wbr>tTextTest.cpp (added)<br>
+++ cfe/trunk/unittests/AST/Commen<wbr>tTextTest.cpp Wed May 16 05:30:09 2018<br>
@@ -0,0 +1,122 @@<br>
+//===- unittest/AST/CommentTextTest.c<wbr>pp - Comment text extraction test ----===//<br>
+//<br>
+// The LLVM Compiler Infrastructure<br>
+//<br>
+// This file is distributed under the University of Illinois Open Source<br>
+// License. See LICENSE.TXT for details.<br>
+//<br>
+//===------------------------<wbr>------------------------------<wbr>----------------===//<br>
+//<br>
+// Tests for user-friendly output formatting of comments, i.e.<br>
+// RawComment::getFormattedText()<wbr>.<br>
+//<br>
+//===------------------------<wbr>------------------------------<wbr>----------------===//<br>
+<br>
+#include "clang/AST/RawCommentList.h"<br>
+#include "clang/Basic/CommentOptions.h"<br>
+#include "clang/Basic/Diagnostic.h"<br>
+#include "clang/Basic/DiagnosticIDs.h"<br>
+#include "clang/Basic/FileManager.h"<br>
+#include "clang/Basic/FileSystemOptions<wbr>.h"<br>
+#include "clang/Basic/SourceLocation.h"<br>
+#include "clang/Basic/SourceManager.h"<br>
+#include "clang/Basic/VirtualFileSystem<wbr>.h"<br>
+#include "llvm/Support/MemoryBuffer.h"<br>
+#include <gtest/gtest.h><br>
+<br>
+namespace clang {<br>
+<br>
+class CommentTextTest : public ::testing::Test {<br>
+protected:<br>
+ std::string formatComment(llvm::StringRef CommentText) {<br>
+ SourceManagerForFile FileSourceMgr("comment-test.cp<wbr>p", CommentText);<br>
+ SourceManager& SourceMgr = FileSourceMgr.get();<br>
+<br>
+ auto CommentStartOffset = CommentText.find("/");<br>
+ assert(CommentStartOffset != llvm::StringRef::npos);<br>
+ FileID File = SourceMgr.getMainFileID();<br>
+<br>
+ SourceRange CommentRange(<br>
+ SourceMgr.getLocForStartOfFile<wbr>(File).getLocWithOffset(<br>
+ CommentStartOffset),<br>
+ SourceMgr.getLocForEndOfFile(F<wbr>ile));<br>
+ CommentOptions EmptyOpts;<br>
+ // FIXME: technically, the Merged flag we set here is incorrect, but that<br>
+ // shouldn't matter.<br>
+ RawComment Comment(SourceMgr, CommentRange, EmptyOpts, /*Merged=*/true);<br>
+ DiagnosticsEngine Diags(new DiagnosticIDs, new DiagnosticOptions);<br>
+ return Comment.getFormattedText(Sourc<wbr>eMgr, Diags);<br>
+ }<br>
+};<br>
+<br>
+TEST_F(CommentTextTest, FormattedText) {<br>
+ // clang-format off<br>
+ auto ExpectedOutput =<br>
+R"(This function does this and that.<br>
+For example,<br>
+ Runnning it in that case will give you<br>
+ this result.<br>
+That's about it.)";<br>
+ // Two-slash comments.<br>
+ EXPECT_EQ(ExpectedOutput, formatComment(<br>
+R"cpp(<br>
+// This function does this and that.<br>
+// For example,<br>
+// Runnning it in that case will give you<br>
+// this result.<br>
+// That's about it.)cpp"));<br>
+<br>
+ // Three-slash comments.<br>
+ EXPECT_EQ(ExpectedOutput, formatComment(<br>
+R"cpp(<br>
+/// This function does this and that.<br>
+/// For example,<br>
+/// Runnning it in that case will give you<br>
+/// this result.<br>
+/// That's about it.)cpp"));<br>
+<br>
+ // Block comments.<br>
+ EXPECT_EQ(ExpectedOutput, formatComment(<br>
+R"cpp(<br>
+/* This function does this and that.<br>
+ * For example,<br>
+ * Runnning it in that case will give you<br>
+ * this result.<br>
+ * That's about it.*/)cpp"));<br>
+<br>
+ // Doxygen-style block comments.<br>
+ EXPECT_EQ(ExpectedOutput, formatComment(<br>
+R"cpp(<br>
+/** This function does this and that.<br>
+ * For example,<br>
+ * Runnning it in that case will give you<br>
+ * this result.<br>
+ * That's about it.*/)cpp"));<br>
+<br>
+ // Weird indentation.<br>
+ EXPECT_EQ(ExpectedOutput, formatComment(<br>
+R"cpp(<br>
+ // This function does this and that.<br>
+ // For example,<br>
+ // Runnning it in that case will give you<br>
+ // this result.<br>
+ // That's about it.)cpp"));<br>
+ // clang-format on<br>
+}<br>
+<br>
+TEST_F(CommentTextTest, KeepsDoxygenControlSeqs) {<br>
+ // clang-format off<br>
+ auto ExpectedOutput =<br>
+R"(\brief This is the brief part of the comment.<br>
+\param a something about a.<br>
+@param b something about b.)";<br>
+<br>
+ EXPECT_EQ(ExpectedOutput, formatComment(<br>
+R"cpp(<br>
+/// \brief This is the brief part of the comment.<br>
+/// \param a something about a.<br>
+/// @param b something about b.)cpp"));<br>
+ // clang-format on<br>
+}<br>
+<br>
+} // namespace clang<br>
<br>
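The "unterminated raw string" error in the build log quoted earlier in this thread points at a multi-line R"cpp( literal passed directly as an argument to the EXPECT_EQ macro. Some older GCC releases are known to mishandle multi-line raw string literals inside macro arguments. A common workaround, sketched here under that assumption, is to bind the literal to a variable before the macro call. This is not necessarily what r332576 does; SKETCH_EQ and rawStringHoistedOutOfMacro are hypothetical names used only for illustration.<br>

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a gtest-style comparison macro.
#define SKETCH_EQ(A, B) (std::string(A) == std::string(B))

bool rawStringHoistedOutOfMacro() {
  // The multi-line raw string is defined outside of any macro invocation,
  // so the preprocessor never has to collect it as part of a macro argument.
  const char *Input = R"cpp(
// first line
// second line)cpp";
  const char *Expected = "\n// first line\n// second line";
  // Only plain identifiers are passed to the macro.
  return SKETCH_EQ(Expected, Input);
}
```

The same pattern applies to the tests above: each R"cpp(...)cpp" input could be stored in a local variable and passed to formatComment() inside EXPECT_EQ by name.<br>
<br>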
<br>
______________________________<wbr>_________________<br>
cfe-commits mailing list<br>
<a href="mailto:cfe-commits@lists.llvm.org" target="_blank">cfe-commits@lists.llvm.org</a><br>
<a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits" rel="noreferrer" target="_blank">http://lists.llvm.org/cgi-bin/<wbr>mailman/listinfo/cfe-commits</a><br>
</blockquote></div><br></div>
</div></div></blockquote></div><br></div>
</div></div><br>
<br></blockquote></div><br></div>