[cfe-commits] [PATCH] GCC Preprocessor Assertions

Douglas Gregor dgregor at apple.com
Sat Dec 3 12:32:40 PST 2011


On Dec 1, 2011, at 11:46 PM, Andrew Craik wrote:

> Hello,
> 
> At the LLVM Developers' Meeting a couple of weeks ago I mentioned that I had a patch to add support to the pre-processor to handle GCC-style pre-processor assertions.

> Please find attached two patches:
> pp_assert_impl.patch - this adds handling of assertions to the pre-processor
> pp_assert_default.patch - this adds default cpu and system assertions to Targets.cpp
> 
> I look forward to hearing from the community and hopefully getting these patches integrated.


Thanks. A bunch of comments below, but generally things are look very good!

Starting with pp_assert_impl.patch:

Index: clang/include/clang/Lex/Preprocessor.h
===================================================================
--- clang.orig/include/clang/Lex/Preprocessor.h	2011-11-30 11:39:19.000000000 +1000
+++ clang/include/clang/Lex/Preprocessor.h	2011-12-02 12:23:53.246940900 +1000
@@ -15,6 +15,7 @@
 #define LLVM_CLANG_LEX_PREPROCESSOR_H
 
 #include "clang/Lex/MacroInfo.h"
+#include "clang/Lex/AssertionInfo.h"
 #include "clang/Lex/Lexer.h"
 #include "clang/Lex/PTHLexer.h"
 #include "clang/Lex/PPCallbacks.h"
@@ -240,6 +241,10 @@
   /// to the actual definition of the macro.
   llvm::DenseMap<IdentifierInfo*, MacroInfo*> Macros;
 
+  /// Assertions - For each IdentifierInfo that has appeared as an asserted
+  /// predicate, we store the values for which it was asserted
+  llvm::DenseMap<IdentifierInfo*, std::vector<AssertionInfo*> > AssertedValues;

I'm guessing that the common case is that there is only one asserted value for each assertion, so I suggest replacing the std::vector with a UsuallyTinyPtrVector (which will need to be boosted from AST to Basic).

@@ -440,6 +459,30 @@
   macro_iterator macro_begin(bool IncludeExternalMacros = true) const;
   macro_iterator macro_end(bool IncludeExternalMacros = true) const;
 
+  /// AddAssertionValue - assert the specified identifier to the
+  /// set of asserted values for the given assertion key
+  void AddAssertionValue(IdentifierInfo *II, AssertionInfo *AI);
+
+  /// IsAsserted - test if the specified identifier has been asserted as a
+  /// value of the given assertion key returns true iff it has
+  bool IsAsserted(IdentifierInfo *II, AssertionInfo *AI);
+
+  /// RemoveAssertionValue - remove the specified identifier from the set of
+  /// asserted values for the given assertion key
+  void RemoveAssertionValue(IdentifierInfo *II, AssertionInfo *AI);
+
+  /// RemoveAssertion - remove the specified assertion and all of its asserted
+  /// values
+  void RemoveAssertion(IdentifierInfo *II);

A quick request hereā€¦ the LLVM coding standards clarified the naming of functions a few months ago (http://llvm.org/docs/CodingStandards.html#ll_naming), so please name these addAssertionValue, isAsserted, and so on. 
The same comment applies throughout the patch, because we want new code to follow this convention and existing code to eventually migrate over.


+    void Destory() {
+      ValueTokens.clear();
+      this->~AssertionInfo();
+    }

Typo "Destory", propagated throughout.

+/// ReadAssertionPredicate - Lex and validate an assertion predicate, which
+/// occurs after an #assert or #unassert. This sets the token kind to eod and
+/// discards the rest of the line if the assert is invalid.
+/// isAssertUnassert is 1 if this is due to an #assert, 2 if #unassert
+/// directive, 0 if it is something else
+void Preprocessor::ReadAssertionPredicate(Token &AssertionPredTok,
+    char isAssertUnassert) {

Please use an enumeration type rather than 0/1/2 here.

Also, why not indicate failure by returning 'true' and success by returning 'false', as we do elsewhere in Clang?

+  IdentifierInfo *II = AssertionPredTok.getIdentifierInfo();
+  if (II == 0) {
+    bool Invalid = false;
+    std::string Spelling = getSpelling(AssertionPredTok, &Invalid);
+    if (Invalid) {
+      return;
+    }
+
+    const IdentifierInfo &Info = Identifiers.get(Spelling);
+    if (Info.isCPlusPlusOperatorKeyword()) {
+      Diag(AssertionPredTok, diag::err_pp_operator_used_as_assert_predicate)
+        << Spelling;
+    } else {
+      Diag(AssertionPredTok, diag::err_pp_predicate_not_identifier);
+    }
+    // Fall through on error.
+  } else if (isAssertUnassert && II->getPPKeywordID() == tok::pp_assert) {
+    // Error if using assert as a predicate since using it would require
+    // #assert
+    Diag(AssertionPredTok, diag::err_assert_predicate_name);
+  } else {
+    // Okay, we got a good identifier node. Return it.
+    return;
+  }

If you did switch the return value to 'bool', this code could be de-nested and clarified a bit:

if (IdentifierInfo *II = AssertionPredTok.getIdentifierInfo()) {
  if (isAssertUnassert && II->getPPKeywordID() == tok::pp_assert)
    return Diag(AssertionPredTok, diag::err_assert_predicate_name);

  return false;
}

// diagnose non-identifiers.

+//===----------------------------------------------------------------------===//
+// Preprocessor Assertion Directive Handling.
+//===----------------------------------------------------------------------===//
+void Preprocessor::HandleAssertDirective(Token &DTok) {
+  Token AssertionPredicateTok;
+  ReadAssertionPredicate(AssertionPredicateTok, 1);
+
+  if (AssertionPredicateTok.is(tok::eod)) {
+    return;
+  }
+
+  AssertionInfo *AI =
+    AllocateAssertionInfo(AssertionPredicateTok.getLocation());
+  Token Tok;
+  LexUnexpandedToken(Tok);
+  ReadAssertionValue(Tok, AI);
+
+  // add to AI
+  if (AI != 0) {

I find this interface to ReadAssertionValue to be a bit odd. How about having ReadAssertionValue lex all of the tokens it needs to, then return an allocated AssertionInfo* on success or null to indicate failure? It would simplify control flow for its callers.

+void Preprocessor::HandleUnassertDirective(Token &DTok) {
+  Token AssertionPredicateTok;
+  ReadAssertionPredicate(AssertionPredicateTok, 2);
+
+  // check to see if we read eod which indicates an error has occurred and
+  // an appropriate diagnostic has already been generated
+  if (AssertionPredicateTok.is(tok::eod)) {
+    return;
+  }
+
+  Token Tok;
+  LexUnexpandedToken(Tok);
+
+  IdentifierInfo *II = AssertionPredicateTok.getIdentifierInfo();
+
+  // we have a valid identifier so we need to create an AssertionInfo to hold
+  // the value to associate with the predicate
+  assertion_iterator iter = AssertedValues.find(II);
+
+  // we don't have an entry in the assertion table for the predicate, this is a
+  // noop so escape early
+  if (iter == assertion_end()) {
+    return;
+  }

Shouldn't we still parse the rest of the #unassert to make sure it's well-formed? Or, at the very least, skip to the end of the directive?

+void Preprocessor::ReadAssertionValue(Token &Tok, AssertionInfo *AI) {
+  if (Tok.isNot(tok::l_paren)) {
+    // emit diag
+    ReleaseAssertionInfo(AI);
+    AI = 0;
+    return DiscardUntilEndOfDirective();
+  }

You're missing the diagnostic here.

+  while (true) {
+    LexUnexpandedToken(Tok);
+    if (Tok.is(tok::r_paren) || Tok.is(tok::eod)) {
+      break;
+    }
+    AI->AddToken(Tok);
+  }

Could you add a comment saying that it's intentional that parentheses don't nest?It surprised me, but I see that's the documented behavior, and I don't want someone coming through later and "fixing" it.

+  // check for the closing ), if we don't have it relase the AssetionInfo and
+  // discard the rest of the directive
+  if (!Tok.is(tok::r_paren)) {
+    ReleaseAssertionInfo(AI);
+    AI = 0;
+    return DiscardUntilEndOfDirective();
+  }

Comment typo "relase". Also, shouldn't we emit a diagnostic here?

+  // check there is nothing after the closing ), if we have anything release
+  // the Assertioninfo and idscard the rest of the directive
+  LexUnexpandedToken(Tok);
+  if (!Tok.is(tok::eod)) {
+    ReleaseAssertionInfo(AI);
+    AI = 0;
+    return DiscardUntilEndOfDirective();
+  }

Diagnostic here?

+    while (true) {
+      PP.LexUnexpandedToken(PeekTok);
+      if (PeekTok.is(tok::r_paren) || PeekTok.is(tok::eod)) {
+        break;
+      }
+      AI->AddToken(PeekTok);
+    }

Same request for a comment about non-nesting parens.

+    // check for the closing ), if we don't have it relase the AssertionInfo
+    // and discard the rest of the directive
+    if (!PeekTok.is(tok::r_paren)) {
+      PP.Diag(PredicateTok, diag::err_pp_expr_bad_token_start_expr);
+      PP.ReleaseAssertionInfo(AI);
+      AI = 0;
+      return true;
+    }

"relase" typo in comment.

This patch is looking really good. One non-trivial piece that's still missing is that the AssertedValues map will need to be serialized to the AST file and (lazily) deserialized, so that this feature works with precompiled headers.

pp_assert_default.patch looks good to me.

Thanks for working on this! 

	- Doug



More information about the cfe-commits mailing list