[cfe-dev] clang leveraging Elsa?

Wed Oct 3 13:05:21 PDT 2007

>> We have.  The killer problem is that Elsa's implementation will  
>> not allow us to achieve the performance goals of the clang  
>> project.  In addition, Elsa doesn't solve the hard part of C++  
>> parsing (the semantic analysis and type checking), isn't built as  
>> a reusable library (in the way clang aims to be), doesn't get the  
>> corner cases of the languages it parses correct, etc.
> Actually, it does do semantic analysis and type checking.

I'm sorry, I should have been more clear.  Also, my understanding of  
Elsa is somewhat dated, about a year old, so it could be obsolete.

What I meant is that Elsa (as I understand it) has enough semantic  
analysis to parse, but is not enforcing all of the constraints  
required by the language.  For example, it does not (AFAIK) correctly  
enforce things like integer constant expressions, constraints on VLAs  
etc.  If your goal is to just parse correct code, this is fine.  If  
you want to correctly enforce the requirements of the language, this  
isn't ok.
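
To make the distinction concrete, here is a minimal sketch of the kind of
check I mean.  Everything in it is hypothetical (the Expr type, the helper
names, the diagnostic text); it is not code from clang or Elsa, and the
real rules for integer constant expressions cover far more than this toy
does.  It assumes VarRef stands for a reference to an ordinary,
non-constant variable.

```cpp
#include <memory>
#include <string>

// Toy expression tree (hypothetical, for illustration only).
struct Expr {
    enum Kind { IntLiteral, VarRef, Add } kind = IntLiteral;
    long value = 0;                  // used by IntLiteral
    std::unique_ptr<Expr> lhs, rhs;  // used by Add
};

std::unique_ptr<Expr> lit(long v) {
    auto e = std::make_unique<Expr>();
    e->kind = Expr::IntLiteral;
    e->value = v;
    return e;
}

std::unique_ptr<Expr> var() {
    auto e = std::make_unique<Expr>();
    e->kind = Expr::VarRef;
    return e;
}

std::unique_ptr<Expr> add(std::unique_ptr<Expr> l, std::unique_ptr<Expr> r) {
    auto e = std::make_unique<Expr>();
    e->kind = Expr::Add;
    e->lhs = std::move(l);
    e->rhs = std::move(r);
    return e;
}

// A parser can build the tree without ever asking this question;
// enforcing the language means asking it and rejecting the program.
bool isIntegerConstantExpr(const Expr &e) {
    switch (e.kind) {
    case Expr::IntLiteral:
        return true;
    case Expr::VarRef:
        return false;  // assumption: refers to a non-constant variable
    case Expr::Add:
        return isIntegerConstantExpr(*e.lhs) && isIntegerConstantExpr(*e.rhs);
    }
    return false;
}

// Returns an empty string if the bound is acceptable for an array declared
// at file scope, otherwise a (made-up) diagnostic message.
std::string checkFileScopeArrayBound(const Expr &bound) {
    if (!isIntegerConstantExpr(bound))
        return "error: variable length array declared at file scope";
    return "";
}
```

With this, a bound like 2 + 3 is accepted, while 2 + n (n being a variable)
parses just as well but is diagnosed.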

> Could you elaborate on what you mean by "isn't built as a reusable  
> library"? API-wise I think it's ok in that regard.

Specifically, lexing, parsing, and AST building are not cleanly  
separable as they are in clang.  In clang, it is possible to  
implement an action module that uses the parser but doesn't  
necessarily build an AST at all.
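
As a rough sketch of that separation (the names ParseActions,
CountingActions and parseDeclarations are made up for this example and are
not clang's actual interface), the parser only reports what it recognizes
through callbacks, and the client decides what, if anything, to build.
The toy grammar assumed here is a sequence of whitespace-separated
"type name ;" declarations.

```cpp
#include <sstream>
#include <string>

// Hypothetical callback interface: the parser calls this, and knows
// nothing about what the client does with the information.
struct ParseActions {
    virtual ~ParseActions() = default;
    virtual void onDeclaration(const std::string &type,
                               const std::string &name) = 0;
};

// A client that never builds an AST: it just counts declarations.
struct CountingActions : ParseActions {
    int count = 0;
    void onDeclaration(const std::string &, const std::string &) override {
        ++count;
    }
};

// Toy parser for whitespace-separated "type name ;" declarations.
// Returns false on the first malformed declaration.
bool parseDeclarations(const std::string &source, ParseActions &actions) {
    std::istringstream in(source);
    std::string type, name, semi;
    while (in >> type) {
        if (!(in >> name >> semi) || semi != ";")
            return false;
        actions.onDeclaration(type, name);
    }
    return true;
}
```

An AST-building client would be just another ParseActions subclass, so the
parser itself does not change when the client does.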

> I'm also not sure about what you mean by corner cases. Elsa's C  
> support isn't ideal because it pretends that C is a subset of C++.  
> And if by corner cases you mean that it doesn't always store  
> everything it parsed in the AST, that's also correct. However  
> I've found that filling in missing information is next to trivial  
> and I am not aware of any other shortcomings. Could you elaborate  
> on that point too?

See above: accepting a superset of a language is much easier than  
accepting the language properly.

An additional issue is one of diagnostics.  I haven't successfully  
built elsa, so I can't play with it, but my guess is that the  
diagnostics produced by elsa are not very good.  I would be  
interested to know if that guess is correct or not.  Having  
*extremely good* diagnostics is a very important goal for clang.

>> While we could extend Elsa to complete its support for C++ and  
>> polish the corner cases, fixing the performance issues would  
>> require a complete redesign.  As such, reusing elsa is a  
>> non-starter. :(

> Elsa's support for C++ is fairly complete. In my view, templates  
> are the only crucial part that needs work. It fails on anything  
> beyond simple template instantiation (lucky for me it's barely  
> enough for Mozilla). However, template support is a hard part that  
> will have to be dealt with in any C++ frontend.

Yep, if elsa was otherwise ok, I would much rather have us extend it  
instead of reinventing yet another new thing.

> Elsa was also designed with performance in mind (but I agree that  
> the authors could've done much better). An annoying part of elsa is  
> hand-rolled data structures (one named string!), so I've been  
> considering doing some sort of refactoring to get rid of or change  
> some of the obscure data structures. The type system also needs  
> slight redoing. Perhaps we could redo it in clang's image.

Reliance on GLR parsing makes Elsa fundamentally slower than a parser  
that does well-constrained lookahead and backtracking.  I am  
actually a fan of GLR parsing, and I think that Elsa's implementation  
is a pretty good one.  However, GLR parsing requires *building  
speculative parse trees* and then eliminating the speculation later.   
In my experience with clang, I've found that anything which does  
memory allocation or touches the heap is orders of magnitude slower  
than something that can avoid it.

I don't see how GLR parsing can be done without a liberal amount of  
heap traffic, but maybe I'm missing something.
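
For contrast, here is a minimal sketch of what I mean by constrained
lookahead and backtracking.  It is a toy, not clang's parser: TokenCursor,
isIdentifier and tryConsumeSimpleDeclaration are hypothetical names, and
the only construct it recognizes is "identifier identifier ;".  The point
is that the speculation is nothing more than a saved index into the token
buffer, so giving up costs one assignment and no allocation.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Read-only view over an already-lexed token buffer (hypothetical type).
struct TokenCursor {
    const std::vector<std::string> &tokens;
    std::size_t pos = 0;

    explicit TokenCursor(const std::vector<std::string> &t) : tokens(t) {}

    // Returns the current token, or nullptr at end of input.
    const std::string *peek() const {
        return pos < tokens.size() ? &tokens[pos] : nullptr;
    }
};

// Crude check: the first character is a letter or an underscore.
bool isIdentifier(const std::string &tok) {
    if (tok.empty())
        return false;
    unsigned char first = static_cast<unsigned char>(tok[0]);
    return std::isalpha(first) != 0 || first == '_';
}

// Tentatively match `identifier identifier ;`.  On success the cursor is
// left after the ';'.  On failure it is rewound to where it started, and
// no speculative tree was ever built.
bool tryConsumeSimpleDeclaration(TokenCursor &cur) {
    const std::size_t saved = cur.pos;
    for (int i = 0; i < 2; ++i) {
        const std::string *tok = cur.peek();
        if (!tok || !isIdentifier(*tok)) {
            cur.pos = saved;  // backtrack
            return false;
        }
        ++cur.pos;
    }
    const std::string *semi = cur.peek();
    if (!semi || *semi != ";") {
        cur.pos = saved;  // backtrack
        return false;
    }
    ++cur.pos;
    return true;
}
```

Given the tokens T x ; x = 1 ; the first call consumes "T x ;" and the
second fails on "=" and leaves the cursor where it was, so the caller can
go on to try parsing an expression from the same position.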

>> clang is definitely developed in the open and welcomes  
>> contributors.  However, our C++ support is basically non-existent  
>> (and we don't have anyone really working on it), so Elsa is  
>> probably a better solution to C++ parsing issues in the short  
>> term.  Over the next couple years, I expect the clang C++ support  
>> to come up to the point where it is both industrial quality and  
>> useful for a broad variety of clients.  It also has much better  
>> ObjC support than elsa ;-)

> We have been discussing adding ObjC++ to elsa :)

Nice!

> Me and a number of other people would like an actively maintained  
> C++ frontend suitable for analysis and source2source. GCC isn't an  
> option, neither is waiting on someone to write another frontend  
> from scratch since that would take another half a decade.

Yep, there is a clear need!

> It would be nice if useful parts of elsa were absorbed by clang so  
> I wouldn't have to play catch up with real compilers.  What would  
> you need done to elsa to consider using it in clang? Perhaps it  
> could serve as a stopgap measure while the faster clang C++  
> frontend matures.

I agree, but I don't see how the two can be merged.  Do you have any  
specific suggestion?

-Chris