[cfe-dev] [PATCH] GCC Preprocessor Assertions revision 2

Andrew Craik andrew.craik at oracle.com
Tue Mar 6 21:27:06 PST 2012


Hi Argyrios,

Thank you for reviewing my patches.

I have replied to your comments inline below.

I ran into preprocessor assertions when I was trying to build an older 
version of Grub which is what prompted my implementation.

Kind regards,
Andrew

On 07-Mar-12 03:21, Argyrios Kyrtzidis wrote:
> Hi all,
>
> I'd like to get feedback and consensus on whether we should add "GCC 
> Preprocessor Assertions" to clang. This gcc extension has been 
> deprecated since gcc-3, here's a quote from gcc 3.1 docs, (bold added 
> by me):
>
> From http://gcc.gnu.org/onlinedocs/gcc-3.1/cpp/Assertions.html
>
>> Assertions are a *deprecated* alternative to macros in writing 
>> conditionals to test what sort of computer or system the compiled 
>> program will run on. Assertions are usually predefined, but you can 
>> define them with preprocessing directives or command-line options.
>>
>> Assertions were intended to provide a more systematic way to describe 
>> the compiler's target system. However, in practice they are just as 
>> unpredictable as the system-specific predefined macros. In addition, 
>> they are not part of any standard, and only a few compilers support 
>> them. Therefore, the use of assertions is less portable than the use 
>> of system-specific predefined macros. *We recommend you do not use 
>> them at all*.
>
> Do note that, after checking on http://ideone.com, gcc still supports 
> them on gcc-4.3.4.
>
> Andrew did great work on implementing the feature; apart from the 
> preprocessor changes, he added support for them in the preprocessing 
> record and serialization/deserialization support in the 
> ASTReader/ASTWriter.
> This is a significant amount of code that we have to maintain and make 
> sure it works well with other parts of the codebase, and I'm worried 
> about the maintenance burden that this deprecated feature will impose, 
> e.g.
>
> -it reserves 1 bit out of the IdentifierInfo object. There's only 1 
> bit left currently so afterwards there will be non left to easily use 
> for other purposes, without doubling the IdentifierInfo object's size.
If anyone has suggestions on how to avoid this, I am happy to consider 
reworking my patch to incorporate such an improvement.
>
> -it increases the size of the IdentifierInfo data that are stored in 
> the serialized AST.
I think this is unavoidable if the feature is going to be implemented, 
but again suggestions on how to improve things are most welcome.
>
> Although the changes are extensive, there may be more that are needed:
>
> -gcc docs say that you can specify assertions by command-line options; 
> I did not see this implemented in Andrew's patches
I am aware the command-line option (-A) to assert a key-value pair is 
not implemented in the patches I have submitted. I didn't write this 
support yet because clang uses -A for other things and I wasn't sure if 
that needed to be changed or if we were going to use a different flag - 
the actual patch will be relatively small. If the feature is accepted 
and guidance is provided as to what flag we should use I am happy to 
write and contribute this patch
>
> -the feature was not integrated into our PCH validation, to be more 
> specific, validating and giving errors when assertions from PCH clash 
> with predefined assertions or assertions from the command-line
I do not believe there is anything that needs to be validated in terms 
of PCH vs command-line vs source preprocessor assertions. You can assert 
multiple values for a given key quite correctly (unlike macros) and 
assertions are designed to be added and removed using all of these 
features and it is not wrong to do so. Further, asserting the same 
key-value pair twice is not an error and so I don't think there needs to 
be validation of clashes since, by design, these clashes are permitted.
>
> -It is not clear to me how this feature should interact with modules.
I am not sure how these would interact since the reference 
implementation, GCC, does not support modules in the versions I have 
tested. Doug, do you have any thoughts on this since you have worked on 
modules? Since the behavior is undefined we can probably do whatever is 
easiest and makes the most sense.
>
>
> So, to recap, the question is this, should we implement and maintain a 
> gcc extension that was deprecated since gcc 3 ?
I think clang should implement and maintain this feature because
1) It's widely supported by existing compilers including GCC 4.6.X 
(http://gcc.gnu.org/onlinedocs/gcc-4.6.3/gcc/Preprocessor-Options.html#Preprocessor-Options), 
Intel C/C++ (http://denali.princeton.edu/intel_cc_docs/c_ug/index.htm), 
and ARM's toolchain 
(http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0491c/CHDFDIDE.html)
2) Preprocessor assertions are an optional extension defined in System V 
Release 4 (SVR4) see the System V Interface manual Vol 3 page 270 
(http://www.sco.com/developers/devspecs/vol3_ps.ps) and so should be 
implemented to support SVR4 systems which have chosen to implement this 
feature.
3) It was clearly identified as a missing feature by FIXMEs in Chris' 
original commit of PPDirectives.cpp and currently the compiler produces 
only cryptic errors when this feature is encountered.
4) It is not possible to work around this feature without having to 
modify the source and doing so with an automated sed or awk script is 
risky and difficult to integrate into some build processes. I did 
consider trying to emulate assertions using macros, but I found this 
wasn't possible using only the preprocessor. Asserts and macros exist in 
different namespaces and so you can define and assert the same symbol. 
In many cases this is not a problem because all we want to do is test if 
a macro is defined, but if the value of the macro is important then a 
collision in the names would cause a problem. The assertion tests 
themselves would have to be found and rewritten using a sed or awk 
script which is a fragile hack rather than a proper fix and having to 
run this script before a compile would complicate the build process etc. 
This kind of workaround also assumes that the source can be modified 
prior to compilation.
5) Clang has publicly stated that it will aim for GCC compatibility 
where possible and these patches make compatibility with this feature 
possible.
>
> -Argyrios
>
>
> On Dec 14, 2011, at 9:26 PM, Andrew Craik wrote:
>
>> Hi Doug,
>>
>> Please find attached a series of 4 patches which have incorporated 
>> your feedback on my previous attempt to implement support for 
>> GCC-style pre-processor assertions in Clang; the feature is now 
>> working with PCH and PCH chaining. The patches are as follows from 
>> first to last:
>> 1) boost_usuallytinyptrvector_to_basic.patch - moves the 
>> UsuallyPtrVector type to Basic, per your previous, and adds an erase 
>> method to it
>> 2) pp_assert_impl.patch - reworked version of the original patch to 
>> implement handling of asserts in the pre-processor
>> 3) pp_assert_defaults.patch - same as previous post adding the 
>> default asserts for systems and cpus
>> 4) pp_assert_serialization.patch - adds serialization support for 
>> assertions and all the required callbacks etc
>>
>> For the serialization I had to add a bit to the IdentifierInfo to 
>> track if the identifier had asserted values to facilitate separate 
>> deserialization of asserts and macros. This resulted in the size of 
>> the emitted IdentifierInfo exceeding the 2 bytes that were allocated 
>> for it so I have increased the size as you can see in the patch.
>>
>> I look forward to hearing from you.
>>
>> Andrew
>>
>> On 04-Dec-11 08:33, Andrew Craik wrote:
>>> Hi Doug,
>>>
>>> Thanks for the detailed feedback on the code style and control flow; all
>>> your comments seem very reasonable and I will go and rework the patch. I
>>> will see if I can add another patch to do the
>>> serialization/de-serialization.
>>>
>>> Andrew
>>>
>>> On 04-Dec-11 06:32, Douglas Gregor wrote:
>>>> On Dec 1, 2011, at 11:46 PM, Andrew Craik wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> At the LLVM Developers' Meeting a couple of weeks ago I mentioned 
>>>>> that I had a patch to add support to the pre-processor to handle 
>>>>> GCC-style pre-processor assertions.
>>>>> Please find attached two patches:
>>>>> pp_assert_impl.patch - this adds handling of assertions to the 
>>>>> pre-processor
>>>>> pp_assert_default.patch - this adds default cpu and system 
>>>>> assertions to Targets.cpp
>>>>>
>>>>> I look forward to hearing from the community and hopefully getting 
>>>>> these patches integrated.
>>>> Thanks. A bunch of comments below, but generally things are look 
>>>> very good!
>>>>
>>>> Starting with pp_assert_impl.patch:
>>>>
>>>> Index: clang/include/clang/Lex/Preprocessor.h
>>>> ===================================================================
>>>> --- clang.orig/include/clang/Lex/Preprocessor.h2011-11-30 
>>>> 11:39:19.000000000 +1000
>>>> +++ clang/include/clang/Lex/Preprocessor.h2011-12-02 
>>>> 12:23:53.246940900 +1000
>>>> @@ -15,6 +15,7 @@
>>>>   #define LLVM_CLANG_LEX_PREPROCESSOR_H
>>>>
>>>>   #include "clang/Lex/MacroInfo.h"
>>>> +#include "clang/Lex/AssertionInfo.h"
>>>>   #include "clang/Lex/Lexer.h"
>>>>   #include "clang/Lex/PTHLexer.h"
>>>>   #include "clang/Lex/PPCallbacks.h"
>>>> @@ -240,6 +241,10 @@
>>>>     /// to the actual definition of the macro.
>>>>     llvm::DenseMap<IdentifierInfo*, MacroInfo*>   Macros;
>>>>
>>>> +  /// Assertions - For each IdentifierInfo that has appeared as an 
>>>> asserted
>>>> +  /// predicate, we store the values for which it was asserted
>>>> +  llvm::DenseMap<IdentifierInfo*, std::vector<AssertionInfo*> > 
>>>>   AssertedValues;
>>>>
>>>> I'm guessing that the common case is that there is only one 
>>>> asserted value for each assertion, so I suggest replacing the 
>>>> std::vector with a UsuallyTinyPtrVector (which will need to be 
>>>> boosted from AST to Basic).
>>>>
>>>> @@ -440,6 +459,30 @@
>>>>     macro_iterator macro_begin(bool IncludeExternalMacros = true) 
>>>> const;
>>>>     macro_iterator macro_end(bool IncludeExternalMacros = true) const;
>>>>
>>>> +  /// AddAssertionValue - assert the specified identifier to the
>>>> +  /// set of asserted values for the given assertion key
>>>> +  void AddAssertionValue(IdentifierInfo *II, AssertionInfo *AI);
>>>> +
>>>> +  /// IsAsserted - test if the specified identifier has been 
>>>> asserted as a
>>>> +  /// value of the given assertion key returns true iff it has
>>>> +  bool IsAsserted(IdentifierInfo *II, AssertionInfo *AI);
>>>> +
>>>> +  /// RemoveAssertionValue - remove the specified identifier from 
>>>> the set of
>>>> +  /// asserted values for the given assertion key
>>>> +  void RemoveAssertionValue(IdentifierInfo *II, AssertionInfo *AI);
>>>> +
>>>> +  /// RemoveAssertion - remove the specified assertion and all of 
>>>> its asserted
>>>> +  /// values
>>>> +  void RemoveAssertion(IdentifierInfo *II);
>>>>
>>>> A quick request hereā€¦ the LLVM coding standards clarified the 
>>>> naming of functions a few months ago 
>>>> (http://llvm.org/docs/CodingStandards.html#ll_naming), so please 
>>>> name these addAssertionValue, isAsserted, and so on.
>>>> The same comment applies throughout the patch, because we want new 
>>>> code to follow this convention and existing code to eventually 
>>>> migrate over.
>>>>
>>>>
>>>> +    void Destory() {
>>>> +      ValueTokens.clear();
>>>> +      this->~AssertionInfo();
>>>> +    }
>>>>
>>>> Typo "Destory", propagated throughout.
>>>>
>>>> +/// ReadAssertionPredicate - Lex and validate an assertion 
>>>> predicate, which
>>>> +/// occurs after an #assert or #unassert. This sets the token kind 
>>>> to eod and
>>>> +/// discards the rest of the line if the assert is invalid.
>>>> +/// isAssertUnassert is 1 if this is due to an #assert, 2 if #unassert
>>>> +/// directive, 0 if it is something else
>>>> +void Preprocessor::ReadAssertionPredicate(Token&AssertionPredTok,
>>>> +    char isAssertUnassert) {
>>>>
>>>> Please use an enumeration type rather than 0/1/2 here.
>>>>
>>>> Also, why not indicate failure by returning 'true' and success by 
>>>> returning 'false', as we do elsewhere in Clang?
>>>>
>>>> +  IdentifierInfo *II = AssertionPredTok.getIdentifierInfo();
>>>> +  if (II == 0) {
>>>> +    bool Invalid = false;
>>>> +    std::string Spelling = getSpelling(AssertionPredTok,&Invalid);
>>>> +    if (Invalid) {
>>>> +      return;
>>>> +    }
>>>> +
>>>> +    const IdentifierInfo&Info = Identifiers.get(Spelling);
>>>> +    if (Info.isCPlusPlusOperatorKeyword()) {
>>>> +      Diag(AssertionPredTok, 
>>>> diag::err_pp_operator_used_as_assert_predicate)
>>>> +<<   Spelling;
>>>> +    } else {
>>>> +      Diag(AssertionPredTok, diag::err_pp_predicate_not_identifier);
>>>> +    }
>>>> +    // Fall through on error.
>>>> +  } else if (isAssertUnassert&&   II->getPPKeywordID() == 
>>>> tok::pp_assert) {
>>>> +    // Error if using assert as a predicate since using it would 
>>>> require
>>>> +    // #assert
>>>> +    Diag(AssertionPredTok, diag::err_assert_predicate_name);
>>>> +  } else {
>>>> +    // Okay, we got a good identifier node. Return it.
>>>> +    return;
>>>> +  }
>>>>
>>>> If you did switch the return value to 'bool', this code could be 
>>>> de-nested and clarified a bit:
>>>>
>>>> if (IdentifierInfo *II = AssertionPredTok.getIdentifierInfo()) {
>>>>    if (isAssertUnassert&&   II->getPPKeywordID() == tok::pp_assert)
>>>>      return Diag(AssertionPredTok, diag::err_assert_predicate_name);
>>>>
>>>>    return false;
>>>> }
>>>>
>>>> // diagnose non-identifiers.
>>>>
>>>> +//===----------------------------------------------------------------------===//
>>>> +// Preprocessor Assertion Directive Handling.
>>>> +//===----------------------------------------------------------------------===//
>>>> +void Preprocessor::HandleAssertDirective(Token&DTok) {
>>>> +  Token AssertionPredicateTok;
>>>> +  ReadAssertionPredicate(AssertionPredicateTok, 1);
>>>> +
>>>> +  if (AssertionPredicateTok.is 
>>>> <http://AssertionPredicateTok.is>(tok::eod)) {
>>>> +    return;
>>>> +  }
>>>> +
>>>> +  AssertionInfo *AI =
>>>> +    AllocateAssertionInfo(AssertionPredicateTok.getLocation());
>>>> +  Token Tok;
>>>> +  LexUnexpandedToken(Tok);
>>>> +  ReadAssertionValue(Tok, AI);
>>>> +
>>>> +  // add to AI
>>>> +  if (AI != 0) {
>>>>
>>>> I find this interface to ReadAssertionValue to be a bit odd. How 
>>>> about having ReadAssertionValue lex all of the tokens it needs to, 
>>>> then return an allocated AssertionInfo* on success or null to 
>>>> indicate failure? It would simplify control flow for its callers.
>>>>
>>>> +void Preprocessor::HandleUnassertDirective(Token&DTok) {
>>>> +  Token AssertionPredicateTok;
>>>> +  ReadAssertionPredicate(AssertionPredicateTok, 2);
>>>> +
>>>> +  // check to see if we read eod which indicates an error has 
>>>> occurred and
>>>> +  // an appropriate diagnostic has already been generated
>>>> +  if (AssertionPredicateTok.is 
>>>> <http://AssertionPredicateTok.is>(tok::eod)) {
>>>> +    return;
>>>> +  }
>>>> +
>>>> +  Token Tok;
>>>> +  LexUnexpandedToken(Tok);
>>>> +
>>>> +  IdentifierInfo *II = AssertionPredicateTok.getIdentifierInfo();
>>>> +
>>>> +  // we have a valid identifier so we need to create an 
>>>> AssertionInfo to hold
>>>> +  // the value to associate with the predicate
>>>> +  assertion_iterator iter = AssertedValues.find(II);
>>>> +
>>>> +  // we don't have an entry in the assertion table for the 
>>>> predicate, this is a
>>>> +  // noop so escape early
>>>> +  if (iter == assertion_end()) {
>>>> +    return;
>>>> +  }
>>>>
>>>> Shouldn't we still parse the rest of the #unassert to make sure 
>>>> it's well-formed? Or, at the very least, skip to the end of the 
>>>> directive?
>>>>
>>>> +void Preprocessor::ReadAssertionValue(Token&Tok, AssertionInfo *AI) {
>>>> +  if (Tok.isNot(tok::l_paren)) {
>>>> +    // emit diag
>>>> +    ReleaseAssertionInfo(AI);
>>>> +    AI = 0;
>>>> +    return DiscardUntilEndOfDirective();
>>>> +  }
>>>>
>>>> You're missing the diagnostic here.
>>>>
>>>> +  while (true) {
>>>> +    LexUnexpandedToken(Tok);
>>>> +    if (Tok.is <http://Tok.is>(tok::r_paren) || Tok.is 
>>>> <http://Tok.is>(tok::eod)) {
>>>> +      break;
>>>> +    }
>>>> +    AI->AddToken(Tok);
>>>> +  }
>>>>
>>>> Could you add a comment saying that it's intentional that 
>>>> parentheses don't nest?It surprised me, but I see that's the 
>>>> documented behavior, and I don't want someone coming through later 
>>>> and "fixing" it.
>>>>
>>>> +  // check for the closing ), if we don't have it relase the 
>>>> AssetionInfo and
>>>> +  // discard the rest of the directive
>>>> +  if (!Tok.is(tok::r_paren)) {
>>>> +    ReleaseAssertionInfo(AI);
>>>> +    AI = 0;
>>>> +    return DiscardUntilEndOfDirective();
>>>> +  }
>>>>
>>>> Comment typo "relase". Also, shouldn't we emit a diagnostic here?
>>>>
>>>> +  // check there is nothing after the closing ), if we have 
>>>> anything release
>>>> +  // the Assertioninfo and idscard the rest of the directive
>>>> +  LexUnexpandedToken(Tok);
>>>> +  if (!Tok.is(tok::eod)) {
>>>> +    ReleaseAssertionInfo(AI);
>>>> +    AI = 0;
>>>> +    return DiscardUntilEndOfDirective();
>>>> +  }
>>>>
>>>> Diagnostic here?
>>>>
>>>> +    while (true) {
>>>> +      PP.LexUnexpandedToken(PeekTok);
>>>> +      if (PeekTok.is <http://PeekTok.is>(tok::r_paren) || 
>>>> PeekTok.is <http://PeekTok.is>(tok::eod)) {
>>>> +        break;
>>>> +      }
>>>> +      AI->AddToken(PeekTok);
>>>> +    }
>>>>
>>>> Same request for a comment about non-nesting parens.
>>>>
>>>> +    // check for the closing ), if we don't have it relase the 
>>>> AssertionInfo
>>>> +    // and discard the rest of the directive
>>>> +    if (!PeekTok.is(tok::r_paren)) {
>>>> +      PP.Diag(PredicateTok, diag::err_pp_expr_bad_token_start_expr);
>>>> +      PP.ReleaseAssertionInfo(AI);
>>>> +      AI = 0;
>>>> +      return true;
>>>> +    }
>>>>
>>>> "relase" typo in comment.
>>>>
>>>> This patch is looking really good. One non-trivial piece that's 
>>>> still missing is that the AssertedValues map will need to be 
>>>> serialized to the AST file and (lazily) deserialized, so that this 
>>>> feature works with precompiled headers.
>>>>
>>>> pp_assert_default.patch looks good to me.
>>>>
>>>> Thanks for working on this!
>>>>
>>>> - Doug
>>> _______________________________________________
>>> cfe-commits mailing list
>>> cfe-commits at cs.uiuc.edu <mailto:cfe-commits at cs.uiuc.edu>
>>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
>> <pp_assert_defaults.patch><pp_assert_impl.patch><pp_assert_serialization.patch><boost_usuallytinyptrvector_to_basic.patch>_______________________________________________
>> cfe-commits mailing list
>> cfe-commits at cs.uiuc.edu <mailto:cfe-commits at cs.uiuc.edu>
>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20120307/56db1430/attachment.html>


More information about the cfe-dev mailing list