[cfe-commits] [PATCH] GCC Preprocessor Assertions revision 2

Andrew Craik andrew.craik at oracle.com
Thu Jan 12 18:19:38 PST 2012


*bump* Was just wondering if anyone had had a look at this patch as I 
don't have commit access.

Cheers,
Andrew

On 15-Dec-11 15:26, Andrew Craik wrote:
> Hi Doug,
>
> Please find attached a series of 4 patches which have incorporated 
> your feedback on my previous attempt to implement support for 
> GCC-style pre-processor assertions in Clang; the feature is now 
> working with PCH and PCH chaining. The patches are as follows from 
> first to last:
> 1) boost_usuallytinyptrvector_to_basic.patch - moves the 
> UsuallyPtrVector type to Basic, per your previous, and adds an erase 
> method to it
> 2) pp_assert_impl.patch - reworked version of the original patch to 
> implement handling of asserts in the pre-processor
> 3) pp_assert_defaults.patch - same as previous post adding the default 
> asserts for systems and cpus
> 4) pp_assert_serialization.patch - adds serialization support for 
> assertions and all the required callbacks etc
>
> For the serialization I had to add a bit to the IdentifierInfo to 
> track if the identifier had asserted values to facilitate separate 
> deserialization of asserts and macros. This resulted in the size of 
> the emitted IdentifierInfo exceeding the 2 bytes that were allocated 
> for it so I have increased the size as you can see in the patch.
>
> I look forward to hearing from you.
>
> Andrew
>
> On 04-Dec-11 08:33, Andrew Craik wrote:
>> Hi Doug,
>>
>> Thanks for the detailed feedback on the code style and control flow; all
>> your comments seem very reasonable and I will go and rework the patch. I
>> will see if I can add another patch to do the
>> serialization/de-serialization.
>>
>> Andrew
>>
>> On 04-Dec-11 06:32, Douglas Gregor wrote:
>>> On Dec 1, 2011, at 11:46 PM, Andrew Craik wrote:
>>>
>>>> Hello,
>>>>
>>>> At the LLVM Developers' Meeting a couple of weeks ago I mentioned 
>>>> that I had a patch to add support to the pre-processor to handle 
>>>> GCC-style pre-processor assertions.
>>>> Please find attached two patches:
>>>> pp_assert_impl.patch - this adds handling of assertions to the 
>>>> pre-processor
>>>> pp_assert_default.patch - this adds default cpu and system 
>>>> assertions to Targets.cpp
>>>>
>>>> I look forward to hearing from the community and hopefully getting 
>>>> these patches integrated.
>>> Thanks. A bunch of comments below, but generally things are look 
>>> very good!
>>>
>>> Starting with pp_assert_impl.patch:
>>>
>>> Index: clang/include/clang/Lex/Preprocessor.h
>>> ===================================================================
>>> --- clang.orig/include/clang/Lex/Preprocessor.h    2011-11-30 
>>> 11:39:19.000000000 +1000
>>> +++ clang/include/clang/Lex/Preprocessor.h    2011-12-02 
>>> 12:23:53.246940900 +1000
>>> @@ -15,6 +15,7 @@
>>>    #define LLVM_CLANG_LEX_PREPROCESSOR_H
>>>
>>>    #include "clang/Lex/MacroInfo.h"
>>> +#include "clang/Lex/AssertionInfo.h"
>>>    #include "clang/Lex/Lexer.h"
>>>    #include "clang/Lex/PTHLexer.h"
>>>    #include "clang/Lex/PPCallbacks.h"
>>> @@ -240,6 +241,10 @@
>>>      /// to the actual definition of the macro.
>>>      llvm::DenseMap<IdentifierInfo*, MacroInfo*>   Macros;
>>>
>>> +  /// Assertions - For each IdentifierInfo that has appeared as an 
>>> asserted
>>> +  /// predicate, we store the values for which it was asserted
>>> +  llvm::DenseMap<IdentifierInfo*, std::vector<AssertionInfo*> >   
>>> AssertedValues;
>>>
>>> I'm guessing that the common case is that there is only one asserted 
>>> value for each assertion, so I suggest replacing the std::vector 
>>> with a UsuallyTinyPtrVector (which will need to be boosted from AST 
>>> to Basic).
>>>
>>> @@ -440,6 +459,30 @@
>>>      macro_iterator macro_begin(bool IncludeExternalMacros = true) 
>>> const;
>>>      macro_iterator macro_end(bool IncludeExternalMacros = true) const;
>>>
>>> +  /// AddAssertionValue - assert the specified identifier to the
>>> +  /// set of asserted values for the given assertion key
>>> +  void AddAssertionValue(IdentifierInfo *II, AssertionInfo *AI);
>>> +
>>> +  /// IsAsserted - test if the specified identifier has been 
>>> asserted as a
>>> +  /// value of the given assertion key returns true iff it has
>>> +  bool IsAsserted(IdentifierInfo *II, AssertionInfo *AI);
>>> +
>>> +  /// RemoveAssertionValue - remove the specified identifier from 
>>> the set of
>>> +  /// asserted values for the given assertion key
>>> +  void RemoveAssertionValue(IdentifierInfo *II, AssertionInfo *AI);
>>> +
>>> +  /// RemoveAssertion - remove the specified assertion and all of 
>>> its asserted
>>> +  /// values
>>> +  void RemoveAssertion(IdentifierInfo *II);
>>>
>>> A quick request here... the LLVM coding standards clarified the 
>>> naming of functions a few months ago 
>>> (http://llvm.org/docs/CodingStandards.html#ll_naming), so please 
>>> name these addAssertionValue, isAsserted, and so on.
>>> The same comment applies throughout the patch, because we want new 
>>> code to follow this convention and existing code to eventually 
>>> migrate over.
>>>
>>>
>>> +    void Destory() {
>>> +      ValueTokens.clear();
>>> +      this->~AssertionInfo();
>>> +    }
>>>
>>> Typo "Destory", propagated throughout.
>>>
>>> +/// ReadAssertionPredicate - Lex and validate an assertion 
>>> predicate, which
>>> +/// occurs after an #assert or #unassert. This sets the token kind 
>>> to eod and
>>> +/// discards the rest of the line if the assert is invalid.
>>> +/// isAssertUnassert is 1 if this is due to an #assert, 2 if #unassert
>>> +/// directive, 0 if it is something else
>>> +void Preprocessor::ReadAssertionPredicate(Token&AssertionPredTok,
>>> +    char isAssertUnassert) {
>>>
>>> Please use an enumeration type rather than 0/1/2 here.
>>>
>>> Also, why not indicate failure by returning 'true' and success by 
>>> returning 'false', as we do elsewhere in Clang?
>>>
>>> +  IdentifierInfo *II = AssertionPredTok.getIdentifierInfo();
>>> +  if (II == 0) {
>>> +    bool Invalid = false;
>>> +    std::string Spelling = getSpelling(AssertionPredTok,&Invalid);
>>> +    if (Invalid) {
>>> +      return;
>>> +    }
>>> +
>>> +    const IdentifierInfo&Info = Identifiers.get(Spelling);
>>> +    if (Info.isCPlusPlusOperatorKeyword()) {
>>> +      Diag(AssertionPredTok, 
>>> diag::err_pp_operator_used_as_assert_predicate)
>>> +<<   Spelling;
>>> +    } else {
>>> +      Diag(AssertionPredTok, diag::err_pp_predicate_not_identifier);
>>> +    }
>>> +    // Fall through on error.
>>> +  } else if (isAssertUnassert&&   II->getPPKeywordID() == 
>>> tok::pp_assert) {
>>> +    // Error if using assert as a predicate since using it would 
>>> require
>>> +    // #assert
>>> +    Diag(AssertionPredTok, diag::err_assert_predicate_name);
>>> +  } else {
>>> +    // Okay, we got a good identifier node. Return it.
>>> +    return;
>>> +  }
>>>
>>> If you did switch the return value to 'bool', this code could be 
>>> de-nested and clarified a bit:
>>>
>>> if (IdentifierInfo *II = AssertionPredTok.getIdentifierInfo()) {
>>>     if (isAssertUnassert&&   II->getPPKeywordID() == tok::pp_assert)
>>>       return Diag(AssertionPredTok, diag::err_assert_predicate_name);
>>>
>>>     return false;
>>> }
>>>
>>> // diagnose non-identifiers.
>>>
>>> +//===----------------------------------------------------------------------===// 
>>>
>>> +// Preprocessor Assertion Directive Handling.
>>> +//===----------------------------------------------------------------------===// 
>>>
>>> +void Preprocessor::HandleAssertDirective(Token&DTok) {
>>> +  Token AssertionPredicateTok;
>>> +  ReadAssertionPredicate(AssertionPredicateTok, 1);
>>> +
>>> +  if (AssertionPredicateTok.is(tok::eod)) {
>>> +    return;
>>> +  }
>>> +
>>> +  AssertionInfo *AI =
>>> +    AllocateAssertionInfo(AssertionPredicateTok.getLocation());
>>> +  Token Tok;
>>> +  LexUnexpandedToken(Tok);
>>> +  ReadAssertionValue(Tok, AI);
>>> +
>>> +  // add to AI
>>> +  if (AI != 0) {
>>>
>>> I find this interface to ReadAssertionValue to be a bit odd. How 
>>> about having ReadAssertionValue lex all of the tokens it needs to, 
>>> then return an allocated AssertionInfo* on success or null to 
>>> indicate failure? It would simplify control flow for its callers.
>>>
>>> +void Preprocessor::HandleUnassertDirective(Token&DTok) {
>>> +  Token AssertionPredicateTok;
>>> +  ReadAssertionPredicate(AssertionPredicateTok, 2);
>>> +
>>> +  // check to see if we read eod which indicates an error has 
>>> occurred and
>>> +  // an appropriate diagnostic has already been generated
>>> +  if (AssertionPredicateTok.is(tok::eod)) {
>>> +    return;
>>> +  }
>>> +
>>> +  Token Tok;
>>> +  LexUnexpandedToken(Tok);
>>> +
>>> +  IdentifierInfo *II = AssertionPredicateTok.getIdentifierInfo();
>>> +
>>> +  // we have a valid identifier so we need to create an 
>>> AssertionInfo to hold
>>> +  // the value to associate with the predicate
>>> +  assertion_iterator iter = AssertedValues.find(II);
>>> +
>>> +  // we don't have an entry in the assertion table for the 
>>> predicate, this is a
>>> +  // noop so escape early
>>> +  if (iter == assertion_end()) {
>>> +    return;
>>> +  }
>>>
>>> Shouldn't we still parse the rest of the #unassert to make sure it's 
>>> well-formed? Or, at the very least, skip to the end of the directive?
>>>
>>> +void Preprocessor::ReadAssertionValue(Token&Tok, AssertionInfo *AI) {
>>> +  if (Tok.isNot(tok::l_paren)) {
>>> +    // emit diag
>>> +    ReleaseAssertionInfo(AI);
>>> +    AI = 0;
>>> +    return DiscardUntilEndOfDirective();
>>> +  }
>>>
>>> You're missing the diagnostic here.
>>>
>>> +  while (true) {
>>> +    LexUnexpandedToken(Tok);
>>> +    if (Tok.is(tok::r_paren) || Tok.is(tok::eod)) {
>>> +      break;
>>> +    }
>>> +    AI->AddToken(Tok);
>>> +  }
>>>
>>> Could you add a comment saying that it's intentional that 
>>> parentheses don't nest?It surprised me, but I see that's the 
>>> documented behavior, and I don't want someone coming through later 
>>> and "fixing" it.
>>>
>>> +  // check for the closing ), if we don't have it relase the 
>>> AssetionInfo and
>>> +  // discard the rest of the directive
>>> +  if (!Tok.is(tok::r_paren)) {
>>> +    ReleaseAssertionInfo(AI);
>>> +    AI = 0;
>>> +    return DiscardUntilEndOfDirective();
>>> +  }
>>>
>>> Comment typo "relase". Also, shouldn't we emit a diagnostic here?
>>>
>>> +  // check there is nothing after the closing ), if we have 
>>> anything release
>>> +  // the Assertioninfo and idscard the rest of the directive
>>> +  LexUnexpandedToken(Tok);
>>> +  if (!Tok.is(tok::eod)) {
>>> +    ReleaseAssertionInfo(AI);
>>> +    AI = 0;
>>> +    return DiscardUntilEndOfDirective();
>>> +  }
>>>
>>> Diagnostic here?
>>>
>>> +    while (true) {
>>> +      PP.LexUnexpandedToken(PeekTok);
>>> +      if (PeekTok.is(tok::r_paren) || PeekTok.is(tok::eod)) {
>>> +        break;
>>> +      }
>>> +      AI->AddToken(PeekTok);
>>> +    }
>>>
>>> Same request for a comment about non-nesting parens.
>>>
>>> +    // check for the closing ), if we don't have it relase the 
>>> AssertionInfo
>>> +    // and discard the rest of the directive
>>> +    if (!PeekTok.is(tok::r_paren)) {
>>> +      PP.Diag(PredicateTok, diag::err_pp_expr_bad_token_start_expr);
>>> +      PP.ReleaseAssertionInfo(AI);
>>> +      AI = 0;
>>> +      return true;
>>> +    }
>>>
>>> "relase" typo in comment.
>>>
>>> This patch is looking really good. One non-trivial piece that's 
>>> still missing is that the AssertedValues map will need to be 
>>> serialized to the AST file and (lazily) deserialized, so that this 
>>> feature works with precompiled headers.
>>>
>>> pp_assert_default.patch looks good to me.
>>>
>>> Thanks for working on this!
>>>
>>>     - Doug
>> _______________________________________________
>> cfe-commits mailing list
>> cfe-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
>
>
> _______________________________________________
> cfe-commits mailing list
> cfe-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20120113/9b43386a/attachment.html>


More information about the cfe-commits mailing list