[cfe-dev] Clang 3.2 assertion failure reading AST files: TokenID != tok::identifier && "Already at tok::identifier"

Argyrios Kyrtzidis akyrtzi at gmail.com
Mon Mar 18 11:54:46 PDT 2013


This is fixed in r176148.

-Argyrios

On Feb 26, 2013, at 2:15 PM, Tom Honermann <thonermann at coverity.com> wrote:

> The following code causes Clang (3.2 on Linux) to fail an assertion when deserializing an AST from a PCH file. Note that the struct's identifier (__is_void) matches a Clang keyword.
> 
> struct __is_void {
>  int val;
> } a = { 42 };
> 
> $ clang --version
> clang version 3.2 (tags/RELEASE_32/final)
> Target: x86_64-unknown-linux-gnu
> Thread model: posix
> 
> $ clang -c __is_void.cpp
> <no error, object file is generated successfully>
> 
> $ clang -emit-ast __is_void.cpp
> <no error, AST file is generated successfully>
> 
> $ clang -c __is_void.ast
> clang: include/clang/Basic/IdentifierTable.h:168: void clang::IdentifierInfo::RevertTokenIDToIdentifier(): Assertion `TokenID != tok::identifier && "Already at tok::identifier"' failed.
> clang: error: unable to execute command: Segmentation fault (core dumped)
> clang: error: clang frontend command failed due to signal (use -v to see invocation)
> clang version 3.2 (tags/RELEASE_32/final)
> Target: x86_64-unknown-linux-gnu
> Thread model: posix
> clang: note: diagnostic msg: PLEASE submit a bug report to http://llvm.org/bugs/ and include the crash backtrace, preprocessed source, and associated run script.
> clang: note: diagnostic msg: Error generating preprocessed source(s) - no preprocessable inputs.
> 
> This assertion failure (with a different test case) was previously reported here:
>  http://llvm.org/bugs/show_bug.cgi?id=13020
>  Bug 13020 - Clang 3.1 assertion failures reading and writing AST files
> 
> The assertion failure occurs here:
> 
> include/clang/Basic/IdentifierTable.h:
> 167   void RevertTokenIDToIdentifier() {
> 168     assert(TokenID != tok::identifier && "Already at tok::identifier");
> 169     TokenID = tok::identifier;
> 170     RevertedTokenID = true;
> 171   }
> 
> When called from the AST deserialization code here:
> 
> lib/Serialization/ASTReader.cpp:
> 461 IdentifierInfo *ASTIdentifierLookupTrait::ReadData(const internal_key_type& k,
> 462                                                    const unsigned char* d,
> 463                                                    unsigned DataLen) {
> ...
> 487   unsigned Bits = ReadUnalignedLE16(d);
> ...
> 490   bool HasRevertedTokenIDToIdentifier = Bits & 0x01;
> ...
> 502   // Build the IdentifierInfo itself and link the identifier ID with
> 503   // the new IdentifierInfo.
> 504   IdentifierInfo *II = KnownII;
> 505   if (!II) {
> 506     II = &Reader.getIdentifierTable().getOwn(StringRef(k.first, k.second));
> 507     KnownII = II;
> 508   }
> 509   Reader.markIdentifierUpToDate(II);
> 510   II->setIsFromAST();
> 511
> 512   // Set or check the various bits in the IdentifierInfo structure.
> 513   // Token IDs are read-only.
> 514   if (HasRevertedTokenIDToIdentifier)
> 515     II->RevertTokenIDToIdentifier();
> ...
> 550 }
> 
> At line 515, the code is attempting to restore the RevertedTokenID field for the IdentifierInfo instance by calling RevertTokenIDToIdentifier(), but the code then asserts because the token kind (TokenID) already equals tok::identifier.
> 
> The corresponding serialization code is here:
> 
> lib/Serialization/ASTWriter.cpp:
> 2658 class ASTIdentifierTableTrait {
> ....
> 2741   void EmitData(raw_ostream& Out, IdentifierInfo* II,
> 2742                 IdentID ID, unsigned) {
> ....
> 2750     uint32_t Bits = (uint32_t)II->getObjCOrBuiltinID();
> ....
> 2758     Bits = (Bits << 1) | unsigned(II->hasRevertedTokenIDToIdentifier());
> ....
> 2760     clang::io::Emit16(Out, Bits);
> ....
> 2784   }
> 2785 };
> 
> Lines 1131 and 1132 below contain the calls to revert the token ID and set the token kind to tok::identifier when a keyword is used as a struct name.  I suspect this is what sets the stage for the later assert when deserializing the AST, but I haven't debugged further.
> 
> lib/Parse/ParseDeclCXX.cpp:
> 1049 void Parser::ParseClassSpecifier(tok::TokenKind TagTokKind,
> 1050                                  SourceLocation StartLoc, DeclSpec &DS,
> 1051                                  const ParsedTemplateInfo &TemplateInfo,
> 1052                                  AccessSpecifier AS,
> 1053                                  bool EnteringContext, DeclSpecContext DSC) {
> ....
> 1107   if (TagType == DeclSpec::TST_struct &&
> 1108       !Tok.is(tok::identifier) &&
> 1109       Tok.getIdentifierInfo() &&
> 1110       (Tok.is(tok::kw___is_arithmetic) ||
> ....
> 1125        Tok.is(tok::kw___is_void))) {
> 1126     // GNU libstdc++ 4.2 and libc++ use certain intrinsic names as the
> 1127     // name of struct templates, but some are keywords in GCC >= 4.3
> 1128     // and Clang. Therefore, when we see the token sequence "struct
> 1129     // X", make X into a normal identifier rather than a keyword, to
> 1130     // allow libstdc++ 4.2 and libc++ to work properly.
> 1131     Tok.getIdentifierInfo()->RevertTokenIDToIdentifier();
> 1132     Tok.setKind(tok::identifier);
> 1133   }
> ....
> 1501 }
> 
> The problem might also be that the IdentifierInfo constructor initializes TokenID to tok::identifier by default:
> 
> lib/Basic/IdentifierTable.cpp:
> 31 IdentifierInfo::IdentifierInfo() {
> 32   TokenID = tok::identifier;
> ..
> 48 }
> 
> It isn't clear to me what the preferred fix for this would be.  Options include:
> 
> 1) Remove the assert.
> 
> 2) Change the default initialization of TokenID in the IdentifierInfo constructor from tok::identifier to tok::unknown and force all instances to be explicitly initialized.
> 
> 3) Modify ASTIdentifierLookupTrait::ReadData() above to force the TokenID value to something other than tok::identifier before calling RevertTokenIDToIdentifier().
> 
> 4) Others?
> 
> Tom.