[cfe-dev] Clang parser diagnostics

Fri Aug 21 21:53:44 PDT 2009

Hello everyone.

First, let me say I'm very impressed with both the clang and LLVM
projects.  The quality of the clang source code is so good that I've
learned a lot about how compilers work just from reading it.

In fact, I am writing a compiler for a different language using clang
as a sort of "design guide".  So far I've gotten to the parser, and
I've noticed something interesting about the way clang generates parse
error diagnostics.

If you type something like:

"x = 3 * * 4"

it will point to the second "*" and say "expected expression", because
the operator precedence parser expected an operand at that point.
However, this is not really in keeping with the design philosophy of
very expressive error messages.

If I understand it right, the problem is this: you want to be sure you
don't generate multiple errors that are really all the same error.
Thus when ParseCastExpression (or some other very basic rule) fails,
you want to generate the one and only error, and return an invalid
OwningExprResult which will unwind every single expression parsing
production.  As they unwind, each production remains silent about the
fact that it has failed to parse, because obviously it has: something
it directly depends on is invalid.
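
To make sure I'm describing the same thing, here is a toy sketch of
that pattern.  It is not clang's code: Parser, ExprResult,
parsePrimary and parseProduct are names I made up, and the tokens are
pre-split strings just to keep it short.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy model only: these types and method names are hypothetical and
// are not clang's real classes.
struct ExprResult {
  bool Invalid;
  int Value;
};

struct Parser {
  std::vector<std::string> Toks;   // pre-split tokens, e.g. 3 * * 4
  std::size_t Pos = 0;             // index of the next unread token
  std::vector<std::string> Diags;  // every diagnostic emitted so far

  explicit Parser(std::vector<std::string> T) : Toks(std::move(T)) {}

  bool atEnd() const { return Pos >= Toks.size(); }

  // Lowest-level rule: the only place that reports "expected expression".
  ExprResult parsePrimary() {
    if (!atEnd() && !Toks[Pos].empty() &&
        Toks[Pos].find_first_not_of("0123456789") == std::string::npos)
      return {false, std::stoi(Toks[Pos++])};
    Diags.push_back("error at token " + std::to_string(Pos) +
                    ": expected expression");
    return {true, 0};
  }

  // Higher-level rule: if an operand is invalid it unwinds silently,
  // because the error has already been reported once.
  ExprResult parseProduct() {
    ExprResult LHS = parsePrimary();
    while (!LHS.Invalid && !atEnd() && Toks[Pos] == "*") {
      ++Pos;  // consume '*'
      ExprResult RHS = parsePrimary();
      if (RHS.Invalid)
        return RHS;  // no second diagnostic here
      LHS.Value *= RHS.Value;
    }
    return LHS;
  }
};
```

With the tokens 3 * * 4, parseProduct returns an invalid result and
Diags holds exactly one entry, reported at the second "*".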

However, the caller of ParseCastExpression is aware of more useful
information about why the expression was expected, so it could give a
more useful message like "a binary operator should not follow another
binary operator".  For this though, every user of ParseCastExpression
would need its own diagnostic reporting code which I'd imagine would get
sloppy.  From what I understand, this is why you introduced "notes",
which are a very nice idea even though they don't seem to be present
in this case.  However, that made me think of this situation:

x = 3 * (a really complicated and malformed parenthetical expression)

As the parser unwinds from somewhere in there, you will get a wall of
"note spam", and most of it is probably unnecessary.  The user kind of
gets the point about what went wrong after the first one or two notes,
and then the rest would actually be more confusing to show (like those
"instantiated from here..." template errors).  I can think of a whole
bunch of ways to handle this, but I don't really like any of them.  So
I thought I'd ask what (if anything) you guys plan to do in the
future, since I usually like the ideas in clang.  Perhaps tons of
notes should always be generated, and it is the responsibility of the
user of the diagnostic client to wade through them?  I don't want to
pick something, let my parser get much bigger with diagnostics
everywhere, then change the design later if someone has already got a
better idea.
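
For concreteness, one of the options I have in mind is to let every
production emit its note and have the diagnostic client cap how many
are shown per error.  This is only a rough sketch under that
assumption; CappedDiagClient and its methods are hypothetical names,
not anything in clang.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical diagnostic client, not clang's API: it keeps every
// error but shows at most MaxNotes notes per error and counts the rest.
class CappedDiagClient {
public:
  explicit CappedDiagClient(std::size_t Max) : MaxNotes(Max) {}

  // Start a new error; first close out the previous one.
  void error(const std::string &Msg) {
    flushSuppressed();
    Lines.push_back("error: " + Msg);
  }

  // Called by each production as it unwinds past a failed sub-expression.
  void note(const std::string &Msg) {
    if (NotesSeen++ < MaxNotes)
      Lines.push_back("note: " + Msg);
  }

  // Close out the current error and return everything to display.
  const std::vector<std::string> &finish() {
    flushSuppressed();
    return Lines;
  }

private:
  // Replace the hidden notes of the current error with one summary line.
  void flushSuppressed() {
    if (NotesSeen > MaxNotes)
      Lines.push_back("note: (" + std::to_string(NotesSeen - MaxNotes) +
                      " more notes suppressed)");
    NotesSeen = 0;
  }

  std::size_t MaxNotes;
  std::size_t NotesSeen = 0;
  std::vector<std::string> Lines;
};
```

With a cap of 2, an error followed by four notes would display the
error, the first two notes, and one summary line.  The parser stays
simple because it never decides which notes matter; the downside is
that the first notes emitted are not necessarily the most useful ones.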

Also, I am using C++ because of LLVM, so I am looking forward to using
clang as a compiler.  But the impressive IDEs that clang technology
will enable are probably what I'd wait for before leaving the comfort
of Microsoft's Visual Studio because I really dislike the unix-style
build system.  I assume that Xcode will be the first to really use
clang like this because of the Apple relationship.  The status of C++
support is easy to follow on the commits list but I was wondering:
what's the status of the IDE?  Is there any plan at this point or is it
still too far in the future?

Thanks,
Ken Camann