[cfe-dev] clang leveraging Elsa?

Wed Oct 3 23:32:57 PDT 2007

On Oct 3, 2007, at 12:49 PM, Benjamin Meyer wrote:

> The past few months I have been writing many tools with Roberto
> Raggi's c++ preprocessor and parser.  It is very fast and I have
> enjoyed messing with it.  As you (chris) already know exactly what
> you guys need/want and what would make a good parser I am very
> curious what you can say about it  (where it is good/bad, what it is
> missing etc)
>
> The one I have been using can be found in this package:
>
> ftp://ftp.trolltech.com/qtjambi/source/qtjambi-gpl-src-4.3.0_01.tar.gz
>
> Located in: generator/parser/ and the preprocessor is in generator/
> parser/rpp

It is somewhat irritating to me that there are almost no comments in
this code: it seems well thought out and well written. Is there any
out-of-line documentation available?

Overall, it is an impressive piece of work.  A few minor design
decisions seem strange to me: for example, what is ConditionAST, and
why does it exist?

The ASTs produced seem to be a bit heavier-weight than the clang
ASTs, and they rely on the entire lexed token stream being available
to interpret the location info.  However, from my first few minutes
of looking at it, I don't think that it shares the "fatal flaws"
(from the clang perspective only, obviously) in design or
implementation that Elsa has.  As a matter of fact, while the details
differ significantly, its design is somewhat similar to clang's,
validating clang's design ;-).  One thing that is impossible for me
to determine from inspection is how complete the parser is.
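
To make the location-info point concrete, here is a minimal sketch of
the trade-off.  All type and function names below are hypothetical and
are not taken from either code base; the sketch only assumes that one
design records token indices in AST nodes while the other records a
position that can be interpreted without the token stream.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical token record: where the token sits in the source buffer.
struct Token {
    std::size_t offset;
    std::size_t length;
};

// Style A (assumed): the node stores token indices.  They mean nothing
// without the token stream that produced them.
struct NodeByTokenIndex {
    std::size_t first_token;
    std::size_t last_token;
};

// Style B (assumed): the node stores a source offset directly.
struct NodeByOffset {
    std::size_t begin_offset;
};

// Resolving style A requires the whole token stream to still be around.
std::size_t begin_offset(const NodeByTokenIndex& node,
                         const std::vector<Token>& tokens) {
    return tokens.at(node.first_token).offset;
}

// Resolving style B needs only the node itself.
std::size_t begin_offset(const NodeByOffset& node) {
    return node.begin_offset;
}
```

With style A, the token vector has to stay in memory for as long as
any AST node might be asked for its location; with style B it can be
discarded once parsing is done.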

Since I don't have it built and you do, here are some questions for  
you: :)

1) looking at the preprocessor, the implementation doesn't look  
particularly speedy.  It is using std::strings to push text around.   
Have you timed the preprocessor on large inputs to see how fast it  
really is?
2) the preprocessor seems to get the 90% case right, but doesn't seem  
to be fully conformant.  Do you have any idea whether it has been  
tested against the hard cases in the standard?  For example, the  
clang/test/Preprocessor directory has some example hard cases.
3) does the code handle nasty features like trigraphs?
4) how good is the C++ support?  It seems like there is significant
coverage for a big chunk of the language, but it also seems like
pieces are missing.  For example, without at least partial template
instantiation support you can't correctly parse some C++ code.  Note
that this requires full handling of template specialization etc.  Are
there known holes/deficiencies?
5) it looks like a lot of semantic checks are missing.  The parser
also reads and ignores lots of stuff, even simple things like
break/continue/goto statements.  Is there anything that describes the
current state of the parser?
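
As an illustration of the kind of preprocessor corner cases question 2
is about, here is a small sketch.  These cases are standard macro
expansion rules chosen for illustration; they are not copied from
clang/test/Preprocessor, and all macro and function names are made up.

```cpp
#include <string>

#define STR(x) #x
#define XSTR(x) STR(x)
#define CAT(a, b) a##b
#define XCAT(a, b) CAT(a, b)
#define DEMO_N 7

// Defined before the macro of the same name so the declaration itself
// is not rewritten by the preprocessor.
static int self_ref = 4;
// While a macro is being replaced, its own name is not expanded again.
#define self_ref (self_ref + 1)

// Operands of # are not macro-expanded: yields "DEMO_N".
std::string direct() { return STR(DEMO_N); }

// The extra level expands the argument first: yields "7".
std::string indirect() { return XSTR(DEMO_N); }

int pasted() {
    int x7 = 10, xDEMO_N = 20;
    // XCAT expands DEMO_N before pasting (x7); CAT pastes it as-is
    // (xDEMO_N).
    return XCAT(x, DEMO_N) + CAT(x, DEMO_N);
}

// Expands to (self_ref + 1) exactly once, with no infinite recursion.
int self_reference() { return self_ref; }
```

A preprocessor that gets only the common cases right will often
mishandle the argument pre-expansion rule or the self-reference rule.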
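
The template point in question 4 can be shown with a small sketch (the
names are invented for illustration).  Whether `X * p` is a
multiplication or a pointer declaration depends on which
specialization is selected, so a parser cannot decide without looking
it up.

```cpp
// Primary template: `value` names an enumerator, i.e. a value.
template <bool B>
struct Pick {
    enum { value = 3 };
};

// Explicit specialization: `value` names a type instead.
template <>
struct Pick<true> {
    typedef int value;
};

int p = 4;

// Pick<false>::value is a value, so this is a multiplication.
int as_expression() {
    return Pick<false>::value * p;
}

// Pick<true>::value is a type, so the same token sequence declares a
// local pointer `p` (shadowing the global) that points at the global.
int as_declaration() {
    Pick<true>::value * p = &::p;
    return *p;
}
```

The two function bodies start with the same shape of tokens, yet one
is an expression and the other is a declaration; only selecting the
right specialization tells them apart.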

-Chris