[cfe-dev] facebook clang plugins

Mathieu Baudet mathieubaudet at fb.com
Tue Jun 24 09:47:51 PDT 2014


Thanks for the feedback, Sean, Manuel. I had not thought about the ASTMatchers before, but this sounds interesting (see comments below, though).

On my side, here are a few thoughts on extending the “ASTExporter” plugin to make it useful upstream. Let me stress that this is all highly speculative, and that I am not promising anything :)

0) With some tweaks, it seems feasible to implement an ASCII-art, tree-like output for the ASTExporter that is close enough to the output of the current ASTDumper.

1) Similarly, it should be easy to make the ASTExporter emit binary output instead of JSON (e.g. in this format, which OCaml understands: http://mjambon.com/biniou-format.txt).

2) Then, one could write a dual “ASTImporter”, together with a binary parsing library, so as to provide a full alternative solution for AST (de)serialization.

At this stage, the resulting code would be about the same size as the current binary (de)serialization, but arguably more modular (since it would support JSON and possibly text outputs).
The binary and JSON (de)serialization could be tested in isolation (write, then read back) and also by interoperating with OCaml (if we choose the format above).
To be fair, the whole thing could also be a little less efficient (in size and time) because of the use of a uniform format.
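
To make the "write then read" test concrete, here is a minimal sketch in C++. Everything in it is an assumption made for illustration: the `Node` type, the `write`/`read` functions and the whitespace-separated "kind, child count, children" encoding are hypothetical and are neither the ASTExporter's actual data model nor the biniou format.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical plain-data node: a kind tag plus children.
// The kind must not contain whitespace in this toy encoding.
struct Node {
  std::string kind;
  std::vector<Node> children;
};

bool operator==(const Node &a, const Node &b) {
  return a.kind == b.kind && a.children == b.children;
}

// Uniform encoding: "<kind> <child-count> " followed by each child.
void write(const Node &n, std::ostream &out) {
  out << n.kind << ' ' << n.children.size() << ' ';
  for (const Node &c : n.children) write(c, out);
}

// Inverse of write(); returns false on truncated or malformed input.
bool read(std::istream &in, Node &n) {
  std::size_t count = 0;
  if (!(in >> n.kind >> count)) return false;
  n.children.clear();
  for (std::size_t i = 0; i < count; ++i) {
    Node child;
    if (!read(in, child)) return false;
    n.children.push_back(std::move(child));
  }
  return true;
}

// The "write then read" check: serialize, parse back, compare.
bool roundTrips(const Node &n) {
  std::stringstream buf;
  write(n, buf);
  Node back;
  return read(buf, back) && back == n;
}

// Usage: a leaf and a small tree should both survive the round trip.
inline void exampleRoundTrip() {
  assert(roundTrips(Node{"VarDecl", {}}));
  assert(roundTrips(Node{"FunctionDecl", {Node{"CompoundStmt", {}}}}));
}
```

The same `roundTrips` check would apply unchanged to a JSON or binary backend, as long as each backend provides its own `write`/`read` pair.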

3) With more work, the schema (currently “atd” annotations) could be used to generate an in-memory representation of the AST in terms of tree-like plain data (“PODS”). The ASTExporter and ASTImporter classes, plus appropriate generated modules, would take care of the translation to and from this “protocol” representation.

At this stage, it should be rather easy to meta-generate visitors and perhaps matchers on these PODS. However, be aware that one would not visit/match the original AST, but a copy of it, with a different style of data structures. To me, this observation will always hold if we try to generate visitors or matchers directly (i.e. without first generating an in-memory copy of the AST) from a language-agnostic schema.
Lastly, I wouldn’t expect a meta-generated API to match an existing handwritten API word for word, even if the general style can be maintained.
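
As a rough illustration of visiting such PODS, here is a sketch in C++. The `PodNode` struct, the `visit` function and `countKind` are hypothetical names standing in for what a schema-driven generator might emit; none of them exist in Clang or in the plugins. Note that the visitor only ever sees the plain-data copy, never the original Clang nodes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical plain-data mirror of an AST node, as a generator working
// from the schema might emit it. This is not a Clang type.
struct PodNode {
  std::string kind;  // e.g. "FunctionDecl"
  std::string name;  // empty when the node has no name
  std::vector<PodNode> children;
};

// Hypothetical generated visitor: a pre-order walk over the copy.
template <typename Fn>
void visit(const PodNode &n, Fn &&fn) {
  fn(n);
  for (const PodNode &c : n.children) visit(c, fn);
}

// Example client: count the nodes of a given kind.
std::size_t countKind(const PodNode &root, const std::string &kind) {
  std::size_t n = 0;
  visit(root, [&](const PodNode &node) {
    if (node.kind == kind) ++n;
  });
  return n;
}

// Usage on a hand-built tree standing in for an imported AST.
inline void exampleVisit() {
  PodNode root{"TranslationUnitDecl", "",
               {PodNode{"FunctionDecl", "f", {}},
                PodNode{"FunctionDecl", "g", {}}}};
  assert(countKind(root, "FunctionDecl") == 2);
}
```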

— Mathieu

On Jun 21, 2014, at 4:16 PM, Sean Silva <chisophugis at gmail.com> wrote:




On Sat, Jun 21, 2014 at 3:52 AM, Manuel Klimek <klimek at google.com> wrote:
On Sat, Jun 21, 2014 at 2:33 AM, Sean Silva <chisophugis at gmail.com> wrote:
I'd just like to say that even if OCaml tools parsing JSON is out of scope as Nico suggests, the work you have done to "schematize" the Clang AST could be the start of something really useful for upstream. Currently, I can think of at least two places that would benefit greatly from having such a schema as a "single point of truth": Serialization and ASTMatchers.

Being able to auto-generate those two from a schema (maybe in the form of annotations in the header files) plus a relatively small amount of generator code could eliminate thousands of lines of code.

If RecursiveASTVisitor (~2500 lines) and TreeTransform (~10k lines) could also be generated from the "single point of truth" with relatively little code, then that would be a tremendous savings.

I think your OCaml tool would be quite easy to write with Clang annotated as such, but you could let the Clang developers maintain the annotations for you ;)

This might also pave the way for a more "data-driven" approach to the DynamicASTMatchers, which could significantly reduce the binary size (which is enormous, and IIRC is mostly due to the fact that it just pre-instantiates all the static templates). The same approach might work for the "static" ASTMatchers too, letting the compiler essentially constant-fold all the indirections (which will largely be member pointers I imagine). This might also improve compile time (which is an issue; see http://llvm.org/bugs/show_bug.cgi?id=20061).
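
For illustration only, the member-pointer idea could look roughly like the C++ sketch below. `FakeDecl`, `FieldEquals` and `allOf` are hypothetical names invented for this example; this is not how the ASTMatchers are implemented, just a toy showing that a matcher described as data (a member pointer plus an expected value) can still be folded at compile time.

```cpp
#include <cassert>

// Hypothetical stand-in for an AST node class; not a Clang type.
struct FakeDecl {
  bool isInline;
  int numParams;
};

// A matcher described by data: which member to read and what to expect.
template <typename Node, typename Field>
struct FieldEquals {
  Field Node::*member;
  Field expected;
  constexpr bool matches(const Node &n) const {
    return n.*member == expected;
  }
};

// Functional composition is kept: allOf combines two matchers.
template <typename Node, typename A, typename B>
constexpr bool allOf(const Node &n, const A &a, const B &b) {
  return a.matches(n) && b.matches(n);
}

constexpr FieldEquals<FakeDecl, bool> inlineMatcher{&FakeDecl::isInline, true};
constexpr FieldEquals<FakeDecl, int> unaryMatcher{&FakeDecl::numParams, 1};

// With constant inputs the compiler can evaluate the whole match itself.
constexpr FakeDecl sample{true, 1};
static_assert(allOf(sample, inlineMatcher, unaryMatcher),
              "member-pointer indirections fold at compile time");

// The same matchers also work on values only known at run time.
inline bool isInlineUnary(const FakeDecl &d) {
  return allOf(d, inlineMatcher, unaryMatcher);
}
```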

Generating code for the AST matchers is something we would like, but I think it's orthogonal to the general design of the matchers (we still want the functional composition), so I'm not sure how it would help with the things you mention (apart from getting rid of the manually written matchers, which would still be a big win).

I was talking about a possible simplification of the implementation, not the API it exposes to users.


One of the big problems would be how we auto-generate the documentation for the AST matchers, though. I agree that it would be better to have the documentation (and the examples) on the nodes, but that'd be a lot of work.


The documentation that you guys have produced for the matchers is quite good and could largely be reused/shared/migrated/adapted to the relevant part of the AST itself. (As you said, it would be a lot of work though).


-- Sean Silva





On Thu, Jun 19, 2014 at 10:30 AM, Mathieu Baudet <mathieubaudet at fb.com> wrote:
Hi,

I am looking for feedback on the possibility of contributing some of the clang plugins used at Facebook back to clang.

We just made available a first subset of plugins here: https://github.com/facebook/facebook-clang-plugins

The plugins fall into two groups:
1) Clang analyzer checkers for iOS;
2) A clang frontend plugin to export the internal AST of clang as OCaml-friendly JSON. This plugin comes with OCaml libraries for testing, parsing, and visiting the AST.

Except for the naming conventions, which are not uniform yet, and the need to update the referenced version of clang, the code should be in relatively good shape. In particular, everything has been tested at scale.

Thanks!
—
Mathieu Baudet
_______________________________________________
cfe-dev mailing list
cfe-dev at cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev

