[cfe-dev] [PATCH] GCC Preprocessor Assertions revision 2

Richard Smith richard at metafoo.co.uk
Tue Mar 6 11:31:53 PST 2012


>From a cursory investigation of this feature, it looks like it should be
straightforward to write a tiny awk script to convert any users of it,
which don't use #unassert, to macros:

#assert a(b) -> #define a 1   #define a_b 1
#a (inside pp condition) -> defined(a)
#a(b) (inside pp condition) -> defined(a_b)

Given that, and your analysis of the costs, I think we would need quite a
compelling reason to accept this extension. Unless someone can present a
solid argument in favor of this extension, it does not seem valuable to me.

On Tue, Mar 6, 2012 at 9:21 AM, Argyrios Kyrtzidis <kyrtzidis at apple.com>wrote:

> Hi all,
>
> I'd like to get feedback and consensus on whether we should add "GCC
> Preprocessor Assertions" to clang. This gcc extension has been deprecated
> since gcc-3, here's a quote from gcc 3.1 docs, (bold added by me):
>
> From http://gcc.gnu.org/onlinedocs/gcc-3.1/cpp/Assertions.html
>
> Assertions are a *deprecated* alternative to macros in writing
> conditionals to test what sort of computer or system the compiled program
> will run on. Assertions are usually predefined, but you can define them
> with preprocessing directives or command-line options.
>
> Assertions were intended to provide a more systematic way to describe the
> compiler's target system. However, in practice they are just as
> unpredictable as the system-specific predefined macros. In addition, they
> are not part of any standard, and only a few compilers support them.
> Therefore, the use of assertions is less portable than the use of
> system-specific predefined macros. *We recommend you do not use them at
> all*.
>
>
> Do note that, after checking on http://ideone.com, gcc still supports
> them on gcc-4.3.4.
>
> Andrew did great work on implementing the feature; apart from the
> preprocessor changes, he added support for them in the preprocessing record
> and serialization/deserialization support in the ASTReader/ASTWriter.
> This is a significant amount of code that we have to maintain and make
> sure it works well with other parts of the codebase, and I'm worried
> about the maintenance burden that this deprecated feature will impose, e.g.
>
> -it reserves 1 bit out of the IdentifierInfo object. There's only 1 bit
> left currently so afterwards there will be non left to easily use for other
> purposes, without doubling the IdentifierInfo object's size.
>
> -it increases the size of the IdentifierInfo data that are stored in the
> serialized AST.
>
> Although the changes are extensive, there may be more that are needed:
>
> -gcc docs say that you can specify assertions by command-line options; I
> did not see this implemented in Andrew's patches
>
> -the feature was not integrated into our PCH validation, to be more
> specific, validating and giving errors when assertions from PCH clash with
> predefined assertions or assertions from the command-line
>
> -It is not clear to me how this feature should interact with modules.
>
>
> So, to recap, the question is this, should we implement and maintain a gcc
> extension that was deprecated since gcc 3 ?
>
> -Argyrios
>
>
> On Dec 14, 2011, at 9:26 PM, Andrew Craik wrote:
>
> Hi Doug,
>
> Please find attached a series of 4 patches which have incorporated your
> feedback on my previous attempt to implement support for GCC-style
> pre-processor assertions in Clang; the feature is now working with PCH and
> PCH chaining. The patches are as follows from first to last:
> 1) boost_usuallytinyptrvector_to_basic.patch - moves the UsuallyPtrVector
> type to Basic, per your previous, and adds an erase method to it
> 2) pp_assert_impl.patch - reworked version of the original patch to
> implement handling of asserts in the pre-processor
> 3) pp_assert_defaults.patch - same as previous post adding the default
> asserts for systems and cpus
> 4) pp_assert_serialization.patch - adds serialization support for
> assertions and all the required callbacks etc
>
> For the serialization I had to add a bit to the IdentifierInfo to track if
> the identifier had asserted values to facilitate separate deserialization
> of asserts and macros. This resulted in the size of the emitted
> IdentifierInfo exceeding the 2 bytes that were allocated for it so I have
> increased the size as you can see in the patch.
>
> I look forward to hearing from you.
>
> Andrew
>
> On 04-Dec-11 08:33, Andrew Craik wrote:
>
> Hi Doug,
>
>
> Thanks for the detailed feedback on the code style and control flow; all
>
> your comments seem very reasonable and I will go and rework the patch. I
>
> will see if I can add another patch to do the
>
> serialization/de-serialization.
>
>
> Andrew
>
>
> On 04-Dec-11 06:32, Douglas Gregor wrote:
>
> On Dec 1, 2011, at 11:46 PM, Andrew Craik wrote:
>
>
> Hello,
>
>
> At the LLVM Developers' Meeting a couple of weeks ago I mentioned that I
> had a patch to add support to the pre-processor to handle GCC-style
> pre-processor assertions.
>
> Please find attached two patches:
>
> pp_assert_impl.patch - this adds handling of assertions to the
> pre-processor
>
> pp_assert_default.patch - this adds default cpu and system assertions to
> Targets.cpp
>
>
> I look forward to hearing from the community and hopefully getting these
> patches integrated.
>
> Thanks. A bunch of comments below, but generally things are look very good!
>
>
> Starting with pp_assert_impl.patch:
>
>
> Index: clang/include/clang/Lex/Preprocessor.h
>
> ===================================================================
>
> --- clang.orig/include/clang/Lex/Preprocessor.h 2011-11-30
> 11:39:19.000000000 +1000
>
> +++ clang/include/clang/Lex/Preprocessor.h 2011-12-02 12:23:53.246940900
> +1000
>
> @@ -15,6 +15,7 @@
>
>   #define LLVM_CLANG_LEX_PREPROCESSOR_H
>
>
>   #include "clang/Lex/MacroInfo.h"
>
> +#include "clang/Lex/AssertionInfo.h"
>
>   #include "clang/Lex/Lexer.h"
>
>   #include "clang/Lex/PTHLexer.h"
>
>   #include "clang/Lex/PPCallbacks.h"
>
> @@ -240,6 +241,10 @@
>
>     /// to the actual definition of the macro.
>
>     llvm::DenseMap<IdentifierInfo*, MacroInfo*>   Macros;
>
>
> +  /// Assertions - For each IdentifierInfo that has appeared as an
> asserted
>
> +  /// predicate, we store the values for which it was asserted
>
> +  llvm::DenseMap<IdentifierInfo*, std::vector<AssertionInfo*>   >
>   AssertedValues;
>
>
> I'm guessing that the common case is that there is only one asserted value
> for each assertion, so I suggest replacing the std::vector with a
> UsuallyTinyPtrVector (which will need to be boosted from AST to Basic).
>
>
> @@ -440,6 +459,30 @@
>
>     macro_iterator macro_begin(bool IncludeExternalMacros = true) const;
>
>     macro_iterator macro_end(bool IncludeExternalMacros = true) const;
>
>
> +  /// AddAssertionValue - assert the specified identifier to the
>
> +  /// set of asserted values for the given assertion key
>
> +  void AddAssertionValue(IdentifierInfo *II, AssertionInfo *AI);
>
> +
>
> +  /// IsAsserted - test if the specified identifier has been asserted as a
>
> +  /// value of the given assertion key returns true iff it has
>
> +  bool IsAsserted(IdentifierInfo *II, AssertionInfo *AI);
>
> +
>
> +  /// RemoveAssertionValue - remove the specified identifier from the set
> of
>
> +  /// asserted values for the given assertion key
>
> +  void RemoveAssertionValue(IdentifierInfo *II, AssertionInfo *AI);
>
> +
>
> +  /// RemoveAssertion - remove the specified assertion and all of its
> asserted
>
> +  /// values
>
> +  void RemoveAssertion(IdentifierInfo *II);
>
>
> A quick request hereā€¦ the LLVM coding standards clarified the naming of
> functions a few months ago (
> http://llvm.org/docs/CodingStandards.html#ll_naming), so please name
> these addAssertionValue, isAsserted, and so on.
>
> The same comment applies throughout the patch, because we want new code to
> follow this convention and existing code to eventually migrate over.
>
>
>
> +    void Destory() {
>
> +      ValueTokens.clear();
>
> +      this->~AssertionInfo();
>
> +    }
>
>
> Typo "Destory", propagated throughout.
>
>
> +/// ReadAssertionPredicate - Lex and validate an assertion predicate,
> which
>
> +/// occurs after an #assert or #unassert. This sets the token kind to eod
> and
>
> +/// discards the rest of the line if the assert is invalid.
>
> +/// isAssertUnassert is 1 if this is due to an #assert, 2 if #unassert
>
> +/// directive, 0 if it is something else
>
> +void Preprocessor::ReadAssertionPredicate(Token&AssertionPredTok,
>
> +    char isAssertUnassert) {
>
>
> Please use an enumeration type rather than 0/1/2 here.
>
>
> Also, why not indicate failure by returning 'true' and success by
> returning 'false', as we do elsewhere in Clang?
>
>
> +  IdentifierInfo *II = AssertionPredTok.getIdentifierInfo();
>
> +  if (II == 0) {
>
> +    bool Invalid = false;
>
> +    std::string Spelling = getSpelling(AssertionPredTok,&Invalid);
>
> +    if (Invalid) {
>
> +      return;
>
> +    }
>
> +
>
> +    const IdentifierInfo&Info = Identifiers.get(Spelling);
>
> +    if (Info.isCPlusPlusOperatorKeyword()) {
>
> +      Diag(AssertionPredTok,
> diag::err_pp_operator_used_as_assert_predicate)
>
> +<<   Spelling;
>
> +    } else {
>
> +      Diag(AssertionPredTok, diag::err_pp_predicate_not_identifier);
>
> +    }
>
> +    // Fall through on error.
>
> +  } else if (isAssertUnassert&&   II->getPPKeywordID() == tok::pp_assert)
> {
>
> +    // Error if using assert as a predicate since using it would require
>
> +    // #assert
>
> +    Diag(AssertionPredTok, diag::err_assert_predicate_name);
>
> +  } else {
>
> +    // Okay, we got a good identifier node. Return it.
>
> +    return;
>
> +  }
>
>
> If you did switch the return value to 'bool', this code could be de-nested
> and clarified a bit:
>
>
> if (IdentifierInfo *II = AssertionPredTok.getIdentifierInfo()) {
>
>    if (isAssertUnassert&&   II->getPPKeywordID() == tok::pp_assert)
>
>      return Diag(AssertionPredTok, diag::err_assert_predicate_name);
>
>
>    return false;
>
> }
>
>
> // diagnose non-identifiers.
>
>
>
> +//===----------------------------------------------------------------------===//
>
> +// Preprocessor Assertion Directive Handling.
>
>
> +//===----------------------------------------------------------------------===//
>
> +void Preprocessor::HandleAssertDirective(Token&DTok) {
>
> +  Token AssertionPredicateTok;
>
> +  ReadAssertionPredicate(AssertionPredicateTok, 1);
>
> +
>
> +  if (AssertionPredicateTok.is(tok::eod)) {
>
> +    return;
>
> +  }
>
> +
>
> +  AssertionInfo *AI =
>
> +    AllocateAssertionInfo(AssertionPredicateTok.getLocation());
>
> +  Token Tok;
>
> +  LexUnexpandedToken(Tok);
>
> +  ReadAssertionValue(Tok, AI);
>
> +
>
> +  // add to AI
>
> +  if (AI != 0) {
>
>
> I find this interface to ReadAssertionValue to be a bit odd. How about
> having ReadAssertionValue lex all of the tokens it needs to, then return an
> allocated AssertionInfo* on success or null to indicate failure? It would
> simplify control flow for its callers.
>
>
> +void Preprocessor::HandleUnassertDirective(Token&DTok) {
>
> +  Token AssertionPredicateTok;
>
> +  ReadAssertionPredicate(AssertionPredicateTok, 2);
>
> +
>
> +  // check to see if we read eod which indicates an error has occurred and
>
> +  // an appropriate diagnostic has already been generated
>
> +  if (AssertionPredicateTok.is(tok::eod)) {
>
> +    return;
>
> +  }
>
> +
>
> +  Token Tok;
>
> +  LexUnexpandedToken(Tok);
>
> +
>
> +  IdentifierInfo *II = AssertionPredicateTok.getIdentifierInfo();
>
> +
>
> +  // we have a valid identifier so we need to create an AssertionInfo to
> hold
>
> +  // the value to associate with the predicate
>
> +  assertion_iterator iter = AssertedValues.find(II);
>
> +
>
> +  // we don't have an entry in the assertion table for the predicate,
> this is a
>
> +  // noop so escape early
>
> +  if (iter == assertion_end()) {
>
> +    return;
>
> +  }
>
>
> Shouldn't we still parse the rest of the #unassert to make sure it's
> well-formed? Or, at the very least, skip to the end of the directive?
>
>
> +void Preprocessor::ReadAssertionValue(Token&Tok, AssertionInfo *AI) {
>
> +  if (Tok.isNot(tok::l_paren)) {
>
> +    // emit diag
>
> +    ReleaseAssertionInfo(AI);
>
> +    AI = 0;
>
> +    return DiscardUntilEndOfDirective();
>
> +  }
>
>
> You're missing the diagnostic here.
>
>
> +  while (true) {
>
> +    LexUnexpandedToken(Tok);
>
> +    if (Tok.is(tok::r_paren) || Tok.is(tok::eod)) {
>
> +      break;
>
> +    }
>
> +    AI->AddToken(Tok);
>
> +  }
>
>
> Could you add a comment saying that it's intentional that parentheses
> don't nest?It surprised me, but I see that's the documented behavior, and I
> don't want someone coming through later and "fixing" it.
>
>
> +  // check for the closing ), if we don't have it relase the AssetionInfo
> and
>
> +  // discard the rest of the directive
>
> +  if (!Tok.is(tok::r_paren)) {
>
> +    ReleaseAssertionInfo(AI);
>
> +    AI = 0;
>
> +    return DiscardUntilEndOfDirective();
>
> +  }
>
>
> Comment typo "relase". Also, shouldn't we emit a diagnostic here?
>
>
> +  // check there is nothing after the closing ), if we have anything
> release
>
> +  // the Assertioninfo and idscard the rest of the directive
>
> +  LexUnexpandedToken(Tok);
>
> +  if (!Tok.is(tok::eod)) {
>
> +    ReleaseAssertionInfo(AI);
>
> +    AI = 0;
>
> +    return DiscardUntilEndOfDirective();
>
> +  }
>
>
> Diagnostic here?
>
>
> +    while (true) {
>
> +      PP.LexUnexpandedToken(PeekTok);
>
> +      if (PeekTok.is(tok::r_paren) || PeekTok.is(tok::eod)) {
>
> +        break;
>
> +      }
>
> +      AI->AddToken(PeekTok);
>
> +    }
>
>
> Same request for a comment about non-nesting parens.
>
>
> +    // check for the closing ), if we don't have it relase the
> AssertionInfo
>
> +    // and discard the rest of the directive
>
> +    if (!PeekTok.is(tok::r_paren)) {
>
> +      PP.Diag(PredicateTok, diag::err_pp_expr_bad_token_start_expr);
>
> +      PP.ReleaseAssertionInfo(AI);
>
> +      AI = 0;
>
> +      return true;
>
> +    }
>
>
> "relase" typo in comment.
>
>
> This patch is looking really good. One non-trivial piece that's still
> missing is that the AssertedValues map will need to be serialized to the
> AST file and (lazily) deserialized, so that this feature works with
> precompiled headers.
>
>
> pp_assert_default.patch looks good to me.
>
>
> Thanks for working on this!
>
>
> - Doug
>
> _______________________________________________
>
> cfe-commits mailing list
>
> cfe-commits at cs.uiuc.edu
>
> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
>
> <pp_assert_defaults.patch><pp_assert_impl.patch>
> <pp_assert_serialization.patch><boost_usuallytinyptrvector_to_basic.patch>
> _______________________________________________
> cfe-commits mailing list
> cfe-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
>
>
>
> _______________________________________________
> cfe-dev mailing list
> cfe-dev at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20120306/09131132/attachment.html>


More information about the cfe-dev mailing list