Nasty parsing bug?

Fri Aug 15 11:45:36 PDT 2014

On Fri, Aug 15, 2014 at 1:55 PM, Abramo Bagnara
<abramo.bagnara at bugseng.com> wrote:
> Il 15/08/2014 16:51, Aaron Ballman ha scritto:
>> On Fri, Aug 15, 2014 at 10:01 AM, Abramo Bagnara
>> <abramo.bagnara at bugseng.com> wrote:
>>> Il 15/08/2014 15:46, Aaron Ballman ha scritto:
>>>> On Fri, Aug 15, 2014 at 9:04 AM, Abramo Bagnara
>>>> <abramo.bagnara at bugseng.com> wrote:
>>>>> Il 15/08/2014 13:51, Aaron Ballman ha scritto:
>>>>>> On Fri, Aug 15, 2014 at 4:10 AM, Abramo Bagnara
>>>>>> <abramo.bagnara at bugseng.com> wrote:
>>>>>>> Il 07/08/2014 15:42, Aaron Ballman ha scritto:
>>>>>>>> On Thu, Aug 7, 2014 at 9:30 AM, Abramo Bagnara <abramo.bagnara at gmail.com> wrote:
>>>>>>>>> On 07/08/2014 15:15, Aaron Ballman wrote:
>>>>>>>>>> On Thu, Aug 7, 2014 at 5:39 AM, Abramo Bagnara
>>>>>>>>>> <abramo.bagnara at bugseng.com> wrote:
>>>>>>>>>>> In the following source
>>>>>>>>>>>
>>>>>>>>>>> enum e { a };
>>>>>>>>>>> enum e __attribute__((unused)) x;
>>>>>>>>>>> enum e
>>>>>>>>>>> __attribute__((unused)) y;
>>>>>>>>>>>
>>>>>>>>>>> the declaration of x is parsed without errors, but the declaration of y
>>>>>>>>>>> gives a parse error i.e. the presence of the newline makes a difference (!)
>>>>>>>>>>>
>>>>>>>>>>> Indeed I cannot imagine a reason why the parsing should be influenced by
>>>>>>>>>>> token position.
>>>>>>>>>>>
>>>>>>>>>>> Opinions anyone?
>>>>>>>>>>
>>>>>>>>>> Both are illegal parses -- a GNU-style attribute cannot appear in that
>>>>>>>>>> location.
>>>>>>>>>
>>>>>>>>> Hmmm... but GCC compiles the source without any warning...
>>>>>>>>>
>>>>>>>>> Do you mean it is an undocumented feature of GCC or a GCC bug?
>>>>>>>>
>>>>>>>> Undocumented feature, as far as I can tell. That would attach the
>>>>>>>> attribute to the enum e type based on position, but it seems to
>>>>>>>> appertain to the declaration instead.
>>>>>>>>
>>>>>>>> But if GCC accepts this, we may want to consider implementing it as
>>>>>>>> well if there's sufficient reason.
>>>>>>>
>>>>>>> As we already behave in a GCC compatible way parsing
>>>>>>>
>>>>>>> struct s
>>>>>>> __attribute__((used)) x;
>>>>>>> int __attribute__((used)) y;
>>>>>>>
>>>>>>> I think that the proper fix is the attached one (i.e. we don't want to
>>>>>>> detect as illegal something that later we will accept)
>>>>>>>
>>>>>>> Ok to commit?
>>>>>>
>>>>>> Except that if GCC considers that parsing to be a bug, then all we're
>>>>>> really doing is enforcing their bug. Do you know of GCC's stance on
>>>>>> whether this is a bug or not? If it's not a bug, but something they
>>>>>> want to continue supporting, then I think we should fix it as well.
>>>>>
>>>>> This is clearly documented in
>>>>> https://gcc.gnu.org/onlinedocs/gcc/Attribute-Syntax.html
>>>>>
>>>>>   For compatibility with existing code written for compiler versions
>>>>>   that did not implement attributes on nested declarators, some laxity is
>>>>>   allowed in the placing of attributes. If an attribute that only applies
>>>>>   to types is applied to a declaration, it is treated as applying to the
>>>>>   type of that declaration. If an attribute that only applies to
>>>>>   declarations is applied to the type of a declaration, it is treated as
>>>>>   applying to that declaration; and, for compatibility with code placing
>>>>>   the attributes immediately before the identifier declared, such an
>>>>>   attribute applied to a function return type is treated as applying to
>>>>>   the function type, and such an attribute applied to an array element
>>>>>   type is treated as applying to the array type. If an attribute that only
>>>>>   applies to function types is applied to a pointer-to-function type, it
>>>>>   is treated as applying to the pointer target type; if such an attribute
>>>>>   is applied to a function return type that is not a pointer-to-function
>>>>>   type, it is treated as applying to the function type.
>>>>>
>>>>
>>>> The example directly preceding the paragraph you cited is what I
>>>> believe the issue is:
>>>>
>>>>   char *__attribute__((aligned(8))) *f;
>>>>
>>>>   specifies the type “pointer to 8-byte-aligned pointer to char”. Note
>>>> again that this does not work with
>>>>   most attributes; for example, the usage of ‘aligned’ and ‘noreturn’
>>>> attributes given above is not yet supported.
>>>>
>>>> Your code make no distinction between type attributes and declaration
>>>> attributes. GCC's documentation bears out that they don't either, but
>>>> that it is intended to be attribute-specific eventually. That's why I
>>>> don't think we should be implementing it this way.
>>>>
>>>> In Sema/attr-used.c, the definition of struct s properly diagnoses
>>>> that used only applies to variables and functions, but your elaborated
>>>> type specifier on the declaration doesn't warn despite the attribute
>>>> being in the same position and still needing to appertain to the type.
>>>>
>>>> So on the one hand, this is more compatible with how GCC behaves now,
>>>> but it's not how they document that it should behave, which is why I'm
>>>> wondering what problem this solves in practice.
>>>
>>> I think there is a misunderstanding: currently clang already parses
>>>
>>> struct s __attribute__((used)) x;
>>> int __attribute__((used)) y;
>>>
>>> as declaration attributes despite that their position are more suitable
>>> for type attributes. This is due to the fact that the attribute is
>>> declaration specific and it is perfectly congruent with what GCC
>>> documentation says wrt backward compatibility (i.e. "If an attribute
>>> that only applies to declarations is applied to the type of a
>>> declaration, it is treated as applying to that declaration").
>>
>> It is congruent by chance, not design.
>>
>>> The patch addresses *only* a specific bug in trunk where
>>>
>>> struct s __attribute__((used)) x;
>>>
>>> and
>>>
>>> struct s // Note the presence of newline here
>>> __attribute__((used)) x;
>>>
>>> are parsed differently (the latter output a parse error with a
>>> misleading message).
>>
>> We're in agreement that these should not *parse* differently.
>>
>>> The testcase simply ensure that this will always be accepted just as the
>>> variant without newline.
>>
>> The behavior right now is that code which isn't valid is being
>> diagnosed as invalid in an unhelpful way in some situations, and not
>> being diagnosed at all in other situations. The proposed behavior is
>> to stop diagnosing entirely in all situations despite the fact that
>> the code isn't valid by design, just by chance. That doesn't seem like
>> a step in the right direction on its face, which is why I was asking
>> what problem this actually solves.
>>
>> What I think ultimately needs to happen is: 1) fixing the parse, 2)
>> fixing the semantics, 3) behaving according to the documentation of
>> what GCC says *should* be happening, 4) diagnosing resulting issues
>> consistently. Note, I am not suggesting *you* need to do this work.
>> :-)
>>
>> Does that make more sense?
>>
>> That being said, I think I've talked myself into this patch being
>> reasonable since it at least covers #1 and we can incrementally solve
>> 2-4 if we remain aware of them. We can fix the parse (which you've
>> done), add FIXMEs to the code explaining the situation, move your
>> current test case into Parser instead of Sema, add a FIXME test to
>> your current test case in Sema stating that the code *should* diagnose
>> according to GCC's documentation, and perhaps file a bug to track it
>> all. That will ease my worry about things being forgotten. If you deal
>> with the code stuff in your patch, I can write up the bug so we don't
>> lose track of this.
>
> Please check if the attached patch conforms to what you meant.

Very close!

> +  // FIXME: we should emit diagnostics about misplaced declaration attribute
> +  // parsed for compatibility with GCC and existing code

Missing a period in the comment; should probably say "semantic
diagnostic when declaration attribute is in type attribute position"
since it's not really a parser FIXME, just that this is the jumping
off point for the fix.

> Index: test/Parser/attributes.c
> ===================================================================
> --- test/Parser/attributes.c (revisione 215658)
> +++ test/Parser/attributes.c (copia locale)
> @@ -95,3 +95,12 @@
>  __attribute__((pure)) int testFundef6(int a) { return a; }
>
>  void deprecatedTestFun(void) __attribute__((deprecated()));
> +
> +struct s {
> +  int a;
> +};
> +
> +// FIXME: we should emit diagnostics about misplaced declaration attribute
> +// parsed for compatibility with GCC and existing code
> +struct s
> +__attribute__((used)) bar;

This code should be commented that it's in place to ensure
compatibility with parsing GNU-style attributes where the attribute is
on a separate line from the elaborated type specifier.

We should then have a new test in Sema (gnu-attributes.c?) that does
something like:

struct s {};

// FIXME: should warn that declaration attribute in type position is
being applied to the declaration instead?
struct s __attribute__((used)) foo;

// FIXME: Should warn that type attribute in declaration position is
being applied to the type instead?
struct s  *bar __attribute__((address_space(1)));

// Should not warn because type attribute is in type position.
struct s *__attribute__((address_space(1))) baz;

// Should not warn because declaration attribute is in declaration position.
struct s *quux __attribute__((used));

We can then expand this file with all of the other insane places you
can put GNU-style attributes.

With those changes, LGTM!

~Aaron