[cfe-dev] clang leveraging Elsa?
clattner at apple.com
Wed Oct 3 23:32:57 PDT 2007
On Oct 3, 2007, at 12:49 PM, Benjamin Meyer wrote:
> The past few months I have been writing many tools with Roberto
> Raggi's c++ preprocessor and parser. It is very fast and I have
> enjoyed messing with it. As you (chris) already know exactly what
> you guys need/want and what would make a good parser I am very
> curious what you can say about it (where it is good/bad, what it is
> missing etc)
> The one I have been using can be found in this package:
> Located in: generator/parser/ and the preprocessor is in generator/
It is somewhat irritating to me that there are almost no comments for
this: it seems well thought out and written. Is there any out of line
Overall, it is an impressive piece of work. There are some minor
strange (to me) design decisions: for example, what is ConditionAST,
why does it exist?
The ASTs produced seem to be a bit heavier-weight than the clang
ASTs, and rely on the entire lexed token stream being available to
interpret the location info. However, in my first few minutes
looking at it, I don't think that it shares the "fatal flaws" (from
the clang perspective only, obviously) in its design or
implementation that elsa has. As a matter of fact, while the details
differ significantly, its design is somewhat similar to clang's,
validating clang's design ;-). One thing that is impossible for me
to do from inspection is to determine how complete the parser is.
Since I don't have it built and you do, here are some questions for
1) looking at the preprocessor, the implementation doesn't look
particularly speedy. It is using std::strings to push text around.
Have you timed the preprocessor on large inputs to see how fast it
2) the preprocessor seems to get the 90% case right, but doesn't seem
to be fully conformant. Do you have any idea whether it has been
tested against the hard cases in the standard? For example, the
clang/test/Preprocessor directory has some example hard cases.
3) does the code handle nasty features like trigraphs?
4) how good is the C++ support? It seems like there is significant
coverage for a big chunk of the language, but it seems like pieces
are missing. Without at least partial template instantiation support
you can't correctly parse some C++ code for example. Note that this
requires full handling of template specialization etc. Are there
5) it looks like a lot of semantic checks are missing. Is there
anything that talks about the current state of the parser? It also
reads and ignores lots of stuff, even simple things like break/
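
To make the template point in 4) concrete, here is a minimal sketch of a
case where the parser cannot decide what a token sequence means until it
has picked the right specialization. All of the names (Pick, m,
value_case, template_case) are made up for illustration and are not from
either code base. The same tokens "Pick<N>::m < 0 > (1)" are a pair of
comparisons for one N and a template-id followed by a constructor call
for another:

```cpp
// Hypothetical example: every name here is invented for illustration.
template <int N>
struct Pick {
    static const int m = 5;            // primary template: m is a value
};

template <>
struct Pick<1> {
    template <int K>
    struct m {                         // specialization: m is a class template
        int v;
        explicit m(int x) : v(x + 100 + K) {}
    };
};

// Pick<0> uses the primary template, so m names an int and the tokens
// parse as the expression ((m < 0) > 1), which is false.
int value_case() {
    return Pick<0>::m < 0 > (1);
}

// Pick<1> uses the explicit specialization, so m names a template and the
// same tokens parse as m<0> constructed from 1, whose member v is then read.
int template_case() {
    return Pick<1>::m < 0 > (1).v;
}
```

A parser that does not track specializations has to guess at the '<'
here, and either guess is wrong for one of the two functions. Compilers
may warn about comparing a bool with 1 in value_case; that warning is
expected in this sketch.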