Nasty parsing bug?

Fri Aug 15 07:51:07 PDT 2014

On Fri, Aug 15, 2014 at 10:01 AM, Abramo Bagnara
<abramo.bagnara at bugseng.com> wrote:
> Il 15/08/2014 15:46, Aaron Ballman ha scritto:
>> On Fri, Aug 15, 2014 at 9:04 AM, Abramo Bagnara
>> <abramo.bagnara at bugseng.com> wrote:
>>> Il 15/08/2014 13:51, Aaron Ballman ha scritto:
>>>> On Fri, Aug 15, 2014 at 4:10 AM, Abramo Bagnara
>>>> <abramo.bagnara at bugseng.com> wrote:
>>>>> Il 07/08/2014 15:42, Aaron Ballman ha scritto:
>>>>>> On Thu, Aug 7, 2014 at 9:30 AM, Abramo Bagnara <abramo.bagnara at gmail.com> wrote:
>>>>>>> On 07/08/2014 15:15, Aaron Ballman wrote:
>>>>>>>> On Thu, Aug 7, 2014 at 5:39 AM, Abramo Bagnara
>>>>>>>> <abramo.bagnara at bugseng.com> wrote:
>>>>>>>>> In the following source
>>>>>>>>>
>>>>>>>>> enum e { a };
>>>>>>>>> enum e __attribute__((unused)) x;
>>>>>>>>> enum e
>>>>>>>>> __attribute__((unused)) y;
>>>>>>>>>
>>>>>>>>> the declaration of x is parsed without errors, but the declaration of y
>>>>>>>>> gives a parse error i.e. the presence of the newline makes a difference (!)
>>>>>>>>>
>>>>>>>>> Indeed I cannot imagine a reason why the parsing should be influenced by
>>>>>>>>> token position.
>>>>>>>>>
>>>>>>>>> Opinions anyone?
>>>>>>>>
>>>>>>>> Both are illegal parses -- a GNU-style attribute cannot appear in that
>>>>>>>> location.
>>>>>>>
>>>>>>> Hmmm... but GCC compiles the source without any warning...
>>>>>>>
>>>>>>> Do you mean it is an undocumented feature of GCC or a GCC bug?
>>>>>>
>>>>>> Undocumented feature, as far as I can tell. That would attach the
>>>>>> attribute to the enum e type based on position, but it seems to
>>>>>> appertain to the declaration instead.
>>>>>>
>>>>>> But if GCC accepts this, we may want to consider implementing it as
>>>>>> well if there's sufficient reason.
>>>>>
>>>>> As we already behave in a GCC compatible way parsing
>>>>>
>>>>> struct s
>>>>> __attribute__((used)) x;
>>>>> int __attribute__((used)) y;
>>>>>
>>>>> I think that the proper fix is the attached one (i.e. we don't want to
>>>>> detect as illegal something that later we will accept)
>>>>>
>>>>> Ok to commit?
>>>>
>>>> Except that if GCC considers that parsing to be a bug, then all we're
>>>> really doing is enforcing their bug. Do you know of GCC's stance on
>>>> whether this is a bug or not? If it's not a bug, but something they
>>>> want to continue supporting, then I think we should fix it as well.
>>>
>>> This is clearly documented in
>>> https://gcc.gnu.org/onlinedocs/gcc/Attribute-Syntax.html
>>>
>>>   For compatibility with existing code written for compiler versions
>>>   that did not implement attributes on nested declarators, some laxity is
>>>   allowed in the placing of attributes. If an attribute that only applies
>>>   to types is applied to a declaration, it is treated as applying to the
>>>   type of that declaration. If an attribute that only applies to
>>>   declarations is applied to the type of a declaration, it is treated as
>>>   applying to that declaration; and, for compatibility with code placing
>>>   the attributes immediately before the identifier declared, such an
>>>   attribute applied to a function return type is treated as applying to
>>>   the function type, and such an attribute applied to an array element
>>>   type is treated as applying to the array type. If an attribute that only
>>>   applies to function types is applied to a pointer-to-function type, it
>>>   is treated as applying to the pointer target type; if such an attribute
>>>   is applied to a function return type that is not a pointer-to-function
>>>   type, it is treated as applying to the function type.
>>>
>>
>> The example directly preceding the paragraph you cited is what I
>> believe the issue is:
>>
>>   char *__attribute__((aligned(8))) *f;
>>
>>   specifies the type “pointer to 8-byte-aligned pointer to char”. Note
>> again that this does not work with
>>   most attributes; for example, the usage of ‘aligned’ and ‘noreturn’
>> attributes given above is not yet supported.
>>
>> Your code make no distinction between type attributes and declaration
>> attributes. GCC's documentation bears out that they don't either, but
>> that it is intended to be attribute-specific eventually. That's why I
>> don't think we should be implementing it this way.
>>
>> In Sema/attr-used.c, the definition of struct s properly diagnoses
>> that used only applies to variables and functions, but your elaborated
>> type specifier on the declaration doesn't warn despite the attribute
>> being in the same position and still needing to appertain to the type.
>>
>> So on the one hand, this is more compatible with how GCC behaves now,
>> but it's not how they document that it should behave, which is why I'm
>> wondering what problem this solves in practice.
>
> I think there is a misunderstanding: currently clang already parses
>
> struct s __attribute__((used)) x;
> int __attribute__((used)) y;
>
> as declaration attributes despite that their position are more suitable
> for type attributes. This is due to the fact that the attribute is
> declaration specific and it is perfectly congruent with what GCC
> documentation says wrt backward compatibility (i.e. "If an attribute
> that only applies to declarations is applied to the type of a
> declaration, it is treated as applying to that declaration").

It is congruent by chance, not design.

> The patch addresses *only* a specific bug in trunk where
>
> struct s __attribute__((used)) x;
>
> and
>
> struct s // Note the presence of newline here
> __attribute__((used)) x;
>
> are parsed differently (the latter output a parse error with a
> misleading message).

We're in agreement that these should not *parse* differently.

> The testcase simply ensure that this will always be accepted just as the
> variant without newline.

The behavior right now is that code which isn't valid is being
diagnosed as invalid in an unhelpful way in some situations, and not
being diagnosed at all in other situations. The proposed behavior is
to stop diagnosing entirely in all situations despite the fact that
the code isn't valid by design, just by chance. That doesn't seem like
a step in the right direction on its face, which is why I was asking
what problem this actually solves.

What I think ultimately needs to happen is: 1) fixing the parse, 2)
fixing the semantics, 3) behaving according to the documentation of
what GCC says *should* be happening, 4) diagnosing resulting issues
consistently. Note, I am not suggesting *you* need to do this work.
:-)

Does that make more sense?

That being said, I think I've talked myself into this patch being
reasonable since it at least covers #1 and we can incrementally solve
2-4 if we remain aware of them. We can fix the parse (which you've
done), add FIXMEs to the code explaining the situation, move your
current test case into Parser instead of Sema, add a FIXME test to
your current test case in Sema stating that the code *should* diagnose
according to GCC's documentation, and perhaps file a bug to track it
all. That will ease my worry about things being forgotten. If you deal
with the code stuff in your patch, I can write up the bug so we don't
lose track of this.

~Aaron