[clang] [Clang][Frontend] Fix a crash when -Wdocumentation is used (PR #68525)

Daniel Höpfl via cfe-commits cfe-commits at lists.llvm.org
Sat Oct 14 12:06:05 PDT 2023


dhoepfl wrote:

> Since **I'm not an expert in clang AST**, it is hard to reduce the failing cases.

Double so for me.

> So the crash is happened because the result of `DeclLocDecomp.second - CommentEndOffset` is overflowed, so operations over `StringRef Text` is making the crash.
> 
> **The best way to fix this issue** is to find out why they are not from the same source code and fix it. However, I'm not sure how to fix it, so I've made a patch to avoid the crash.

While your fix resolves the specific crash we noticed, it does not fix the underlying problem. I found that the source of the problem is noted in [this comment](https://github.com/llvm/llvm-project/blob/80737d2ddf05507d96cdd723fb33a6e44ac72a48/clang/lib/Sema/SemaDecl.cpp#L14892).

I tried to “fix” this by removing a few lines starting [here](https://github.com/llvm/llvm-project/blob/80737d2ddf05507d96cdd723fb33a6e44ac72a48/clang/lib/AST/ASTContext.cpp#L575) (maybe you could also remove lines 561-574?) and moving them a few lines down to get:

```
void ASTContext::attachCommentsToJustParsedDecls(ArrayRef<Decl *> Decls,
                                                 const Preprocessor *PP) {
  if (Comments.empty() || Decls.empty())
    return;

  FileID File;
  for (Decl *D : Decls) {
    SourceLocation Loc = D->getLocation();
    if (Loc.isValid()) {
      // See if there are any new comments that are not attached to a decl.
      // The location doesn't have to be precise - we care only about the file.
      File = SourceMgr.getDecomposedLoc(Loc).first;
      break;
    }
  }

  if (File.isInvalid())
    return;

/* REMOVED HERE */

  // There is at least one comment not attached to a decl.
  // Maybe it should be attached to one of Decls?
  //
  // Note that this way we pick up not only comments that precede the
  // declaration, but also comments that *follow* the declaration -- thanks to
  // the lookahead in the lexer: we've consumed the semicolon and looked
  // ahead through comments.

  for (const Decl *D : Decls) {
    assert(D);
    if (D->isInvalidDecl())
      continue;

    D = &adjustDeclToTemplate(*D);

    const SourceLocation DeclLoc = getDeclLocForCommentSearch(D, SourceMgr);

    if (DeclLoc.isInvalid() || !DeclLoc.isFileID())
      continue;

    if (DeclRawComments.count(D) > 0)
      continue;

/* ADDED THIS */
    FileID File = SourceMgr.getDecomposedLoc(DeclLoc).first;
    if (File.isInvalid())
      continue;

    auto CommentsInThisFile = Comments.getCommentsInFile(File);
    if (!CommentsInThisFile || CommentsInThisFile->empty() ||
        CommentsInThisFile->rbegin()->second->isAttached())
      continue;
/* UNTIL HERE */

    if (RawComment *const DocComment =
            getRawCommentForDeclNoCacheImpl(D, DeclLoc, *CommentsInThisFile)) {
      cacheRawCommentForDecl(*D, *DocComment);
      comments::FullComment *FC = DocComment->parse(*this, PP, D);
      ParsedComments[D->getCanonicalDecl()] = FC;
    }
  }
}
```

That way it no longer crashes … I think this loads the correct comments for each decl but I am not sure if this is the correct way to fix it (or even if it still does what it is supposed to do).


https://github.com/llvm/llvm-project/pull/68525


More information about the cfe-commits mailing list