[PATCH] D46000: [AST] Added a helper to extract a user-friendly text of a comment.
Eric Liu via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Wed Apr 25 11:41:26 PDT 2018
ioeric added inline comments.
================
Comment at: include/clang/AST/RawCommentList.h:118
+ /// // Parts of it might be indented.
+ /// /* The comments styles might be mixed. */
+ /// into
----------------
I'm trying to understand how these cases and RawComment work.
For this case, are the `// ... ` block and `/* ... */` merged in one `RawComment` by default?
================
Comment at: include/clang/AST/RawCommentList.h:126
+ /// * This is a second line. It is indented.
+ /// * This is a third line. */
+ /// and
----------------
Are the `*`s in each lines automatically consumed by the lexer?
================
Comment at: lib/AST/CommentLexer.cpp:301
+
+bool Lexer::tryLexCommands(Token &T) {
assert(CommentState == LCS_InsideBCPLComment ||
----------------
I think we could avoid the "somewhat non-trivial" control flow by merging command and command-less cases in one function:
Something like:
```
const char *TokenPtr = ...;
auto HandleNonCommandToken = ...;
if (!ParseComand) {
HandleNonCommandToken(...);
return;
}
...
switch (...) {
// after all command cases
default:
HandleNonCommandToken(...);
}
```
================
Comment at: lib/AST/CommentLexer.cpp:331
assert(TokenPtr < CommentEnd);
- while (TokenPtr != CommentEnd) {
- switch(*TokenPtr) {
- case '\\':
- case '@': {
- // Commands that start with a backslash and commands that start with
- // 'at' have equivalent semantics. But we keep information about the
- // exact syntax in AST for comments.
- tok::TokenKind CommandKind =
- (*TokenPtr == '@') ? tok::at_command : tok::backslash_command;
+ switch(*TokenPtr) {
+ case '\\':
----------------
We should be extra careful about removing the loop... (It does seem to be redundant though)
================
Comment at: lib/AST/RawCommentList.cpp:380
+ SourceMgr.getSpellingColumnNumber(Tok.getLocation(), &LocInvalid);
+ if (LocInvalid)
+ TokColumn = 0;
----------------
ilya-biryukov wrote:
> ilya-biryukov wrote:
> > ioeric wrote:
> > > Explain when this would be invalid and why `TokColumn = 0` is used?
> > I don't know whether this can be even be invalid, but I'm not confident enough to add an assert there.
> > `TokColumn = 0` seems like a reasonable way to recover if we can't compute the column number, i.e. assume the line starts at the first column if SourceLocation of the line was invalid for any reason.
> >
> > This whole column thing looks weird to me, maybe I should just remove it altogether and just remove the same amount of whitespace in all the lines. WDYT?
> On a second thought, now I remember why I added this in the first place.
> To support the following example we want to take column numbers into account:
>
> ```
> class Foo {
> /* A block comment
> spanning multiple lines
> has too many spaces on the all lines except the first one.
> */
> int func();
> };
> ```
So I think you would want to force `MaxSkip` to 0 if token loc is invalid to make sure no comment is accidentally eaten?
================
Comment at: lib/AST/RawCommentList.cpp:383
+ // Compute the length of whitespace we're allowed to skip.
+ unsigned MaxSkip;
+ if (IsFirstLine) {
----------------
ilya-biryukov wrote:
> ioeric wrote:
> > nit: `unsigned MaxSkip = IsFirstLine ? ... : ...;`
> That would force to get rid of the comments in the if branches, but they seem to be useful.
> Am I missing an obvious style that would preserve the comments?
You could simply merge the comments, which doesn't seem to compromise readability.
================
Comment at: lib/AST/RawCommentList.cpp:343
+ if (CommentText.empty())
+ return ""; // we couldn't retreive the comment.
+
----------------
nit: We don't really know if there was a failure or the comment is simply empty. So I'd probably leave out the comment here to avoid confusion.
================
Comment at: lib/AST/RawCommentList.cpp:349
+ CommentText.begin(), CommentText.end(),
+ /*ParseCommentText=*/false);
+
----------------
`s/ParseCommentText/ParseCommands/`?
================
Comment at: lib/AST/RawCommentList.cpp:352
+ std::string Result;
+ unsigned IndentColumn = 0;
+
----------------
This variable could use a comment.
================
Comment at: lib/AST/RawCommentList.cpp:393
+
+ llvm::StringRef Trimmed = TokText.drop_front(std::min(MaxSkip, WhitespaceLen));
+ Result += Trimmed;
----------------
I think `MaxSkip` could be removed with:
```
llvm::StringRef Trimmed = TokText.drop_front(IsFirstLine ? WhitespaceLen :
std::max((int)IndentColumn - (int)TokColumn, 0)));
```
Repository:
rC Clang
https://reviews.llvm.org/D46000
More information about the cfe-commits
mailing list