[cfe-commits] [Patch] Fix for comma-removal ahead of __VA_ARGS__ in macros

Mon Nov 5 15:52:09 PST 2012

On Mon, Nov 5, 2012 at 11:22 AM, Andy Gibbs <andyg1001 at hotmail.co.uk> wrote:
> On Monday, November 05, 2012 6:50 PM, Eli Friedman wrote:
>
>> On Mon, Nov 5, 2012 at 3:07 AM, Andy Gibbs wrote:
>>>
>>> Here is the line from the gcc help: "... In this case the C99 standard is
>>> clear that the comma must remain, however the existing GCC extension used
>>> to
>>> swallow the comma. So CPP retains the comma when conforming to a specific
>>> C
>>> standard, and drops it otherwise."
>>
>>
>> Ah, yes... but that text only applies in the case where the macro has
>> zero named arguments.
>
>
> You are right.  I had overlooked this and not realised this was the case.
>
> Ok, I've extended the test case to the following...
>
> #define A(...)   [ __VA_ARGS__ ]
> #define B(...)   [ , __VA_ARGS__ ]
> #define C(...)   [ , ## __VA_ARGS__ ]
> #define D(A,...) [ A , ## __VA_ARGS__ ]
> #define E(A,...) [ __VA_ARGS__ ## A ]
>
> 1: A()      B()      C()      D()      E()
> 2: A(a)     B(a)     C(a)     D(a)     E(a)
> 3: A(,)     B(,)     C(,)     D(,)     E(,)
> 4: A(a,b,c) B(a,b,c) C(a,b,c) D(a,b,c) E(a,b,c)
> 5: A(a,b,)  B(a,b,)  C(a,b,)  D(a,b,)

It might also be worth considering "5: E(a,b,)", which is an error for
GNU and C99, and is valid in MS compatibility mode, where
token-pasting isn't required to form a valid token, and instead the
result is lexed for (possibly multiple) tokens.

> // GCC: 1: [ ] [ , ] [ ] [ ] [ ]
> // GCC: 2: [ a ] [ , a ] [ ,a ] [ a ] [ a ]
> // GCC: 3: [ , ] [ , , ] [ ,, ] [ , ] [ ]
> // GCC: 4: [ a,b,c ] [ , a,b,c ] [ ,a,b,c ] [ a ,b,c ] [ b,ca ]
> // GCC: 5: [ a,b, ] [ , a,b, ] [ ,a,b, ] [ a ,b, ]

These look correct to me.

> // C99: 1: [ ] [ , ] [ , ] [ ] [ ]
> // C99: 2: [ a ] [ , a ] [ ,a ] [ a ] [ a ]
> // C99: 3: [ , ] [ , , ] [ ,, ] [ , ] [ ]
> // C99: 4: [ a,b,c ] [ , a,b,c ] [ ,a,b,c ] [ a ,b,c ] [ b,ca ]
> // C99: 5: [ a,b, ] [ , a,b, ] [ ,a,b, ] [ a ,b, ]

Yes, these look right too. It looks like the only one of these we get
wrong is 1C (where we also incorrectly produce an error in
-pedantic-errors mode). Many of these have undefined behavior (other
than 1C, we produce -pedantic-errors diagnostics in exactly the right
set of cases).

> // MS: 1: [ ] [ ] [ ] [ ] [ ]
> // MS: 2: [ a ] [ , a ] [ ,a ] [ a ] [ a ]
> // MS: 3: [ , ] [ , , ] [ ,, ] [ ] [ ]
> // MS: 4: [ a,b,c ] [ , a,b,c ] [ ,a,b,c ] [ a ,b,c ] [ b,ca ]
> // MS: 5: [ a,b, ] [ , a,b, ] [ ,a,b, ] [ a ,b, ]

5E: [ b,a ]

> Can you think of any further pertinent tests that I have omitted, or does that now cover it?

That seems like pretty good coverage to me. According to a commenter
on MSDN, the Microsoft implementation has another quirk:

#define F(...) [ a , b __VA_ARGS__ ]
F() // [ a b ]

(note that the comma removed is not next to __VA_ARGS__). Even if
that's true, I'm not convinced that we should replicate that behavior.