[cfe-dev] EXTENSION for some diags

Thu Sep 6 05:46:53 PDT 2007

Chris Lattner wrote:-

> clang currently has two distinct notions: we distinguish between what  
> class a particular diagnostic falls into from how it is reported.   
> These two categorizations are used by different pieces of the  
> diagnostic handling machinery.
> 
> The first part of the diagnostics machinery is the piece that is  
> generating the diagnostics: the parser, lexer, sema, checking system,  
> etc.  Each diagnostic itself is tagged (in DiagnosticKinds.def) with  
> a class, which is a member of the set  
> {NOTE,WARNING,EXTENSION,ERROR}.  The important part of the  
> classification at this level is that it is completely independent of  
> how the diagnostic is reported and they have obvious meanings:
> 
> 1. NOTEs (which aren't used and may be removed in the future) are  
> informational messages.  For example, perhaps the optimizer could  
> emit information about which loops are vectorized etc.
> 
> 2. WARNINGs are emitted for legal code that is questionable somehow.
> 
> 3. EXTENSIONs are emitted for illegal code that is accepted anyway as  
> an extension to the language.
> 
> 4. ERRORs flag a malformed program where error recovery techniques  
> must be used to continue parsing.
> 
> I believe that each of these categories is well described and  
> simple.  It is easy to determine which class any particular  
> diagnostic falls into.

So client is just gets one of these severities and it's up to
them how they handle it?

> >a) Each diagnostic tag has a default severity suitable for normal
> >   compiles.  Severities are, in order, suppress, remark, warning,
> >   soft error, hard error, fatal and maybe ice.
> 
> In contrast to your system, we don't specify a "default mapping":  
> that is completely client-specific.  Because the library generating  
> the diagnostic doesn't know/care about how these diagnostics are  
> emitted, we don't need to talk about suppressed diagnostics.  I'm not  
> sure the distinction between soft/hard error in your scheme.

To the client they're both errors.  Only soft errors can have their
severities changed by the user; hard errors are always hard errors.
This was buried in my mail and probably not clear.

> Fatal/ ICE are also interesting: we don't emit a diagnostic for these cases:  
> these are presumably internal consistency failures, where we prefer  
> to have the library assert/abort and die.

Yeah, ICE should probably become fatal.  #error is a good example
of a fatal error - the std specifies translation should halt (something
GCC gets wrong but it's hard to fix in GCC because of its architecture).

> When diagnostics are reported by the parser (etc), they get routed  
> through the Diagnostic class.  This class is designed to do various  
> types of mapping, which turns the class above into a concrete  
> diagnostic level.  The level is completely different from the class,  
> but they have similar names (and are thus somewhat confusing).  The  
> Diagnostic class maps each diagnostic onto the level set, which is  
> { Ignored, Note, Warning, Error, Fatal }.
> 
> These levels are what we expect the tool to report to the user.  As  
> you might expect, the {note,warning,extension} classes can be mapped  
> onto any of these levels, and the {error} class can only be mapped  
> onto {error,fatal}.

OK.  But is this mapping per-tag, or blanket for all tags?  If the
former, it seems unduly inflexible, and if the latter then why
bother having a mapping (i.e. a severity at the raising side) at
all?  Why not just provide a diagnostic tag and leave it to the
client what they want to do with it?  I'm struggling to understand
the rationale for the two-level nature of it.

> I believe that this system is fully general: any interesting
policy
> the client wants can be implemented with this scheme  
> (Diagnostic::setDiagnosticMapping can be used to specify a level on a  
> per-diagnostic basis), and higher level policies (e.g. -Werror, - 
> pedantic, -pedantic-errors etc) can be easily handled by mapping  
> entire classes to levels.

This kind of answers my question above I think, but my confusion
remains.

What about diagnostics, such as for line comments, that is an
extension in one dialect but not another?  Either you're duplicating
the diagnostic tag, which means presenting the user with two tags
for the same diagnostic, or you're requiring clients to know what
is and isn't an extension in various dialects.

What about integer division by zero?  It should be a hard error in
some contexts and not others, even for a single dialect.  How does
the client deal with that?  It seems the only solution is two tags.
I can imagine situations where essentially the same diagnostic
needs to become three or four tags with multiple dialect support.

> >Something I've not implemented yet is diagnostic groups, so
> >you could give groups of tags a name and manipulate them
> >en-masse, e.g. portability warnings, unused warnings, etc.
> 
> This is also something that clang is missing.  In particular, the  
> client would like to be able to manipulate groups at a coarse grain  
> to implement controls such as the GCC options -Wextra, -Wall, -W,  
> etc.  These are basically controls that modify a whole set of  
> diagnostics.  However, it is important to note that the groupings are  
> client specific, so this should be implemented in the driver, not in  
> the libraries.
> 
> The original topic of discussion was what "extensions" should be  
> warned about even when -pedantic is not specified.  In contrast to my  
> original proposal, I now don't think this should be another class of  
> warning, I think this should be controlled with the (currently  
> missing) diagnostic grouping mechanism in the driver.
> 
> Is this design reasonable?

I can see the reasoning behind it, but I'm not convinced it's the
best approach for the reasons above.  I may be misunderstanding
how it works.

One other thing - raising a diagnostic only for it to be suppressed
at the client side can be expensive when various other information
has to be calculated just to raise it.  For example, a diagnostic
indicating that one decl hides another.  Depending on your symbol
table, even determining that something hides something else, or
what kind of entity it's hiding (nice to have that in the diagnostic)
can take time.  Another example is coming up with the English
reprentation of types for diagnostics.  You don't want to do all
that work only to find the diagnostic is suppressed anyway.  How
do you intend to handle such things?

Neil.