[cfe-dev] Lazily parsing additional source files

Jordan Rose jordan_rose at apple.com
Wed May 21 20:16:07 PDT 2014


Hi, Gábor. If you look far back in the SVN history you can see sketches of where we tried this, with an unimplemented concept of "marshalling" used to get data from one ASTContext to another. As I remember, it didn't go very far because it turns out it's very difficult to actually match up types and decls from different translation units.

Trying to parse new code could have better luck, though you'd probably have to change the way things are currently set up to not count the main source file as ended. You could still run into trouble if there are, say, static functions with the same name in the other TU, though.

I'm not sure what you mean by "some type information may not be available in those external source files". You can't actually parse C code fully without type information, because certain constructs are ambiguous otherwise.
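As a small illustration of that ambiguity (a sketch only; the namespaces and function names are made up for the example), the tokens `(T)(x)` are a cast when `T` names a type and a function call when `T` names a function. The statement `T * x;` has the same problem: it is either a pointer declaration or a multiplication. The parser can only choose by looking up what `T` is.

```cpp
#include <cassert>

// Reading 1: T names a type, so (T)(x) is a cast.
namespace t_is_a_type {
typedef double T;

double cast_then_halve(int x) {
    return (T)(x) / 2;   // x converted to double, then floating-point division
}
}  // namespace t_is_a_type

// Reading 2: T names a function, so the same tokens are a call.
namespace t_is_a_function {
int T(int v) { return v + 1; }

int call_then_halve(int x) {
    return (T)(x) / 2;   // calls T(x), then integer division
}
}  // namespace t_is_a_function

// Hypothetical helper that shows the two readings give different results.
void check_both_readings() {
    assert(t_is_a_type::cast_then_halve(3) == 1.5);    // 3.0 / 2
    assert(t_is_a_function::call_then_halve(3) == 2);  // (3 + 1) / 2
}
```

If the declaration of `T` lives in a header that the external file never sees, the parser has no way to pick between these two readings.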

The approach we've considered before is to come up with some AST-agnostic "summary" of a function, like "the first parameter is never modified even though it's passed as non-const, and the second parameter is always the return value". A more advanced form of this would allow checkers to store information this way as well. Then this summary information could be "applied" at a call site (using the declaration in the primary TU), without having to worry about making the ASTs match up. This summary information could also be persisted, meaning that when you reanalyze the same project you wouldn't have to generate the summaries all over again.
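To make the summary idea a bit more concrete, here is a rough sketch of what such a record might look like. Everything in it (`FunctionSummary`, `SummaryTable`, `applyAtCallSite`, the one-line-per-function text format) is hypothetical and is not an existing analyzer API. The point is only that the summary is keyed by function name and parameter index, so it never refers to AST nodes from the other TU.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical AST-agnostic summary of one function.
struct FunctionSummary {
    std::vector<bool> paramNeverModified;  // one flag per parameter
    int returnedParam = -1;                // index of the parameter always returned, or -1
};

// Keyed by function name only, so no ASTContext is involved.
typedef std::map<std::string, FunctionSummary> SummaryTable;

// Persist as text, one function per line: name count flag... returnedParam
std::string serialize(const SummaryTable &table) {
    std::ostringstream out;
    for (const auto &entry : table) {
        const FunctionSummary &s = entry.second;
        out << entry.first << ' ' << s.paramNeverModified.size();
        for (bool flag : s.paramNeverModified) out << ' ' << (flag ? 1 : 0);
        out << ' ' << s.returnedParam << '\n';
    }
    return out.str();
}

SummaryTable deserialize(const std::string &text) {
    SummaryTable table;
    std::istringstream in(text);
    std::string name;
    std::size_t count = 0;
    while (in >> name >> count) {
        FunctionSummary s;
        for (std::size_t i = 0; i < count; ++i) {
            int flag = 0;
            in >> flag;
            s.paramNeverModified.push_back(flag != 0);
        }
        in >> s.returnedParam;
        table[name] = s;
    }
    return table;
}

// What the analyzer would learn at a call site in the primary TU.
struct CallEffect {
    std::vector<bool> argMayBeModified;  // true means the argument must be invalidated
    int resultAliasesArg = -1;           // argument index the result equals, or -1
};

CallEffect applyAtCallSite(const SummaryTable &table, const std::string &callee,
                           std::size_t numArgs) {
    CallEffect effect;
    effect.argMayBeModified.assign(numArgs, true);  // conservative default
    SummaryTable::const_iterator it = table.find(callee);
    if (it == table.end()) return effect;           // no summary: assume the worst
    const FunctionSummary &s = it->second;
    if (s.paramNeverModified.size() != numArgs) return effect;  // arity mismatch
    for (std::size_t i = 0; i < numArgs; ++i)
        effect.argMayBeModified[i] = !s.paramNeverModified[i];
    if (s.returnedParam >= 0 && static_cast<std::size_t>(s.returnedParam) < numArgs)
        effect.resultAliasesArg = s.returnedParam;
    return effect;
}

// Usage: first parameter never modified, second parameter is always returned.
void exampleRoundTrip() {
    SummaryTable table;
    FunctionSummary s;
    s.paramNeverModified = {true, false};
    s.returnedParam = 1;
    table["some_function"] = s;  // made-up function name

    SummaryTable reloaded = deserialize(serialize(table));
    CallEffect effect = applyAtCallSite(reloaded, "some_function", 2);
    assert(!effect.argMayBeModified[0]);
    assert(effect.argMayBeModified[1]);
    assert(effect.resultAliasesArg == 1);
}
```

Matching by name alone is of course too naive for static functions or overloads; a real version would need a more careful key.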

Of course, you don't have to do things this way. I'm just concerned that C is very much structured around the notion of translation units, and that it will be very difficult to handle code outside of that context.

If you have any specific questions, I'll try to answer them fairly promptly. Anna should be coming back soon, too.
Jordan


On May 19, 2014, at 11:37, Gábor Horváth <xazax.hun at gmail.com> wrote:

> Hi!
> 
> I am working on a Google Summer of Code project to improve the Clang Static Analyzer. In that project it would be essential to parse external source files and inject the resulting AST into the translation unit that is being compiled. The external files would contain definitions that are being looked up. The goal would be to avoid runtime cost if no lookup is required. So basically I want to add new code lazily to an existing AST after parsing is done by injecting new source code.
> 
> Moreover some type information may not be available in those external source files, so type information in the translation unit that is being analyzed should be utilized.
> 
> What do you think would be the most efficient and elegant way to approach this problem?
> 
> Thanks in advance,
> Gábor
