[cfe-dev] EXTENSION for some diags

Thu Sep 6 11:27:12 PDT 2007

>> Each diagnostic itself is tagged (in DiagnosticKinds.def) with
>> a class, which is a member of the set
>> {NOTE,WARNING,EXTENSION,ERROR}.  The important part of the
>> classification at this level is that it is completely independent of
>> how the diagnostic is reported and they have obvious meanings:
>
> So client is just gets one of these severities and it's up to
> them how they handle it?

These are classifications, not severities.  The Diagnostic class maps  
this onto a "level" which is basically a severity.  The client  
implements the DiagnosticClient interface which is passed a level.

>>> a) Each diagnostic tag has a default severity suitable for normal
>>>   compiles.  Severities are, in order, suppress, remark, warning,
>>>   soft error, hard error, fatal and maybe ice.
>>
>> In contrast to your system, we don't specify a "default mapping":
>> that is completely client-specific.  Because the library generating
>> the diagnostic doesn't know/care about how these diagnostics are
>> emitted, we don't need to talk about suppressed diagnostics.  I'm not
>> sure the distinction between soft/hard error in your scheme.
>
> To the client they're both errors.  Only soft errors can have their
> severities changed by the user; hard errors are always hard errors.
> This was buried in my mail and probably not clear.

Ok, so soft errors don't require recovery I assume.

>> Fatal/ ICE are also interesting: we don't emit a diagnostic for  
>> these cases:
>> these are presumably internal consistency failures, where we prefer
>> to have the library assert/abort and die.
>
> Yeah, ICE should probably become fatal.  #error is a good example
> of a fatal error - the std specifies translation should halt  
> (something
> GCC gets wrong but it's hard to fix in GCC because of its  
> architecture).

Interesting, good point.  In clang, the client can choose to map it  
either to error or fatal.

>> When diagnostics are reported by the parser (etc), they get routed
>> through the Diagnostic class.  This class is designed to do various
>> types of mapping, which turns the class above into a concrete
>> diagnostic level.  The level is completely different from the class,
>> but they have similar names (and are thus somewhat confusing).  The
>> Diagnostic class maps each diagnostic onto the level set, which is
>> { Ignored, Note, Warning, Error, Fatal }.
>>
>> These levels are what we expect the tool to report to the user.  As
>> you might expect, the {note,warning,extension} classes can be mapped
>> onto any of these levels, and the {error} class can only be mapped
>> onto {error,fatal}.
>
> OK.  But is this mapping per-tag, or blanket for all tags?

The mapping is per-diagnostic.

> If the
> former, it seems unduly inflexible, and if the latter then why
> bother having a mapping (i.e. a severity at the raising side) at
> all?  Why not just provide a diagnostic tag and leave it to the
> client what they want to do with it?  I'm struggling to understand
> the rationale for the two-level nature of it.

There are three separate stages here: producing diagnostics, mapping  
them, and reporting them somehow.  In general, the parser just  
unconditionally emits diagnostics in some cases without care for how  
it will be reported.

In other cases, computing whether the diagnostic should be emitted is  
expensive.  For example "diag::pp_macro_not_used" requires looping  
over all macro definitions to determine whether any of them are  
unused.  This warning is usually not enabled, so doing this  
computation is wasted work. To handle this, the parser makes a query  
the mapping code to see what level it will be mapped onto.  If it is  
mapped onto ignore, the macro table isn't walked (the code is at  
Preprocessor.cpp:1112).

Separating the mapping from reporting thus provides two features: 1)  
the parser can see how diagnostics will get mapped.  2) the mapping  
logic is shared among all the clients.

>> I believe that this system is fully general: any interesting
> policy
>> the client wants can be implemented with this scheme
>> (Diagnostic::setDiagnosticMapping can be used to specify a level on a
>> per-diagnostic basis), and higher level policies (e.g. -Werror, -
>> pedantic, -pedantic-errors etc) can be easily handled by mapping
>> entire classes to levels.
>
> This kind of answers my question above I think, but my confusion
> remains.
>
> What about diagnostics, such as for line comments, that is an
> extension in one dialect but not another?

I assume you mean "//" comments?  In this case, we have logic like  
this (Lexer.cpp):

   if (!Features.BCPLComment) {
     Diag(BufferPtr, diag::ext_bcpl_comment);

     // Mark them enabled so we only emit one warning for this  
translation
     // unit.
     Features.BCPLComment = true;
   }

If // comments aren't an extension in this dialect, then the warning  
is never produced, so it doesn't matter how it is mapped.

> Either you're duplicating
> the diagnostic tag, which means presenting the user with two tags
> for the same diagnostic, or you're requiring clients to know what
> is and isn't an extension in various dialects.

We do sometimes have to duplicate the tag, for example if it's an  
extension in one dialect but a warning in another.  This is actually  
ok though IMO, because you want the message to be different for the  
different cases.

> What about integer division by zero?  It should be a hard error in
> some contexts and not others, even for a single dialect.  How does
> the client deal with that?  It seems the only solution is two tags.
> I can imagine situations where essentially the same diagnostic
> needs to become three or four tags with multiple dialect support.

We'd use two tags: this makes sense because the message should be  
different.  In general, I'd like to have many fine-grained  
diagnostics with extremely helpful messages than trying to squish  
down the number we have.

>> Is this design reasonable?
>
> I can see the reasoning behind it, but I'm not convinced it's the
> best approach for the reasons above.  I may be misunderstanding
> how it works.

Please let me know if the above helps :)

> One other thing - raising a diagnostic only for it to be suppressed
> at the client side can be expensive when various other information
> has to be calculated just to raise it.  For example, a diagnostic
> indicating that one decl hides another.  Depending on your symbol
> table, even determining that something hides something else, or
> what kind of entity it's hiding (nice to have that in the diagnostic)
> can take time.  Another example is coming up with the English
> reprentation of types for diagnostics.  You don't want to do all
> that work only to find the diagnostic is suppressed anyway.  How
> do you intend to handle such things?

I covered this above.

-Chris