[cfe-dev] C AST transformations / questionable use of AST serialization

Manuel Klimek klimek at google.com
Wed Jun 11 23:18:58 PDT 2014


On Thu, Jun 12, 2014 at 12:10 AM, Mark Tullsen <tullsen at galois.com> wrote:

> Hi,
>
> We've been building a tool (eventually to be released BSD) for allowing
> programmers to write custom program properties (complexity, semantic,
> architectural, etc.) in a high level DSL (embedded in Haskell at the
> moment).  We switched from a decent but ad hoc C99 parser to using the
> Clang front end and are very happy customers.  We are using the libclang
> C interface via FFI.
>
> However, we lost one extremely useful capability in this transition.  We
> had some really nice one-liners in our pre-clang days, e.g.,
>
>   property1 = noUnreachableCode . removeDecls (hasPrefix "test_")
>
>     // Remove all the declarations for test code from the project then
>     // test to see if there is no unreachable code
>
>   property2 = noUnreachableCode
>             . removeDecls (hasPrefix "test_")
>             . removeMembersFromStructs (hasPrefix "test_")
>
>     // Ditto, but we also remove structure members that are only there
>     // for testing purposes.
>
> If you don't grok Haskell:
>   - The '.' above is function composition (like '|' in Unix)
>   - removeDecls, removeMembersFromStructs, hasPrefix are higher order
>     functions.
>
> With our switch to clang, we have lost the ability to do quick and easy
> wholesale project transformations like the above removeDecls function.  We
> also have the need to do transformations that add to the code (e.g.,
> inserting attributes).  The output of these transformations (code slicing,
> mutations, extensions) may be only for intermediate use and is not
> necessarily output for the sake of code refactoring.
>
> I'd really like to regain the ability to achieve such transformations.  As
> we explore ways to do this, these are some of my thoughts:
>
>   - These modules
>
>        Refactoring.h - Framework for clang refactoring tools
>        Rewriter.h - Code rewriting interface
>
>     seem to be designed for applying changes to the source and cannot
>     be readily used to modify the AST (nor the serialized form of the AST).
>
>     Correct?
>

Yes.
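
To make the distinction concrete, here is a minimal Haskell sketch of what a
source-level rewrite amounts to: a list of text edits, each identified by an
offset and a length, applied to the file contents. The `Edit` type and
`applyEdits` are hypothetical names used only for illustration; they are not
part of clang or libclang. The point is that nothing in this model touches
an AST node.

```haskell
import Data.List (sortBy)
import Data.Ord (comparing, Down(..))

-- Hypothetical stand-in for a source-level edit: replace `len` characters
-- starting at character offset `offset` with `newText`.
data Edit = Edit { offset :: Int, len :: Int, newText :: String }

-- Apply non-overlapping edits back to front, so that the offsets of the
-- edits still to be applied remain valid.
applyEdits :: [Edit] -> String -> String
applyEdits edits src = foldl applyOne src ordered
  where
    ordered = sortBy (comparing (Down . offset)) edits
    applyOne s (Edit o l t) =
      let (before, rest) = splitAt o s
      in before ++ t ++ drop l rest
```

For example, `applyEdits [Edit 0 3 "long", Edit 4 1 "y"] "int x;"` rewrites
the text only; getting an updated AST means parsing the result again.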


>   - One approach I'm considering is to write a custom encoder/decoder for
>     the serialized AST for our Haskell code.  I.e., porting the
>     clang::serialization stuff to Haskell so that we can read and write
>     .ast files.
>
>     I saw some long past post to this list that discouraged this.
>     But my question is not so much whether you think (as C++ coders) this
>     is the *preferable* way, but
>
>       IF someone is really keen for a 3rd party (non C++) tool to
>       transform the AST
>
>        - Is the above replace-serialization approach even feasible?
>

I think it's feasible, but see below ;)


>        - Any warnings/suggestions if we did try this?
>

- the AST is huge and changes somewhat frequently (not so much in itself,
but new AST nodes are introduced, etc); this might not be a big problem for
you if you only care about C, but it might lead to non-trivial maintenance
effort for the tool
- the AST invariants are hard to get right

In the end it's of course a cost-benefit trade-off. My best guess is that
it's usually not worth trying to maintain an adapted out-of-tree
serialization framework for clang's AST, but YMMV.


>        - Are there alternative ways to do this that don't involve applying
>          rewrites to the source and re-parsing?
>

I'm not aware of any.
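
For what it's worth, a removeDecls-style one-liner can still be phrased on
top of source rewriting plus a re-parse. A rough sketch, assuming each
declaration's name and source extent have already been collected (libclang
exposes these through clang_getCursorSpelling and clang_getCursorExtent):
the `Decl` record and the character-offset representation are hypothetical,
and `hasPrefix` is only a guess at what the original helper did.

```haskell
import Data.List (isPrefixOf, sortBy)
import Data.Ord (comparing, Down(..))

-- Hypothetical record: a declaration's name plus its half-open source
-- extent [declStart, declEnd) as character offsets into the file.
data Decl = Decl { declName :: String, declStart :: Int, declEnd :: Int }

-- Guess at the original helper: does the name start with the prefix?
hasPrefix :: String -> String -> Bool
hasPrefix = isPrefixOf

-- Delete the text of every matching declaration, working back to front so
-- that the offsets of the remaining declarations stay valid.
-- Assumes the extents do not overlap.
removeDecls :: (String -> Bool) -> [Decl] -> String -> String
removeDecls p decls src = foldl cut src victims
  where
    victims = sortBy (comparing (Down . declStart))
                     (filter (p . declName) decls)
    cut s d = take (declStart d) s ++ drop (declEnd d) s
```

The resulting text would then be handed back to clang for parsing, so the
cost of this route is one extra parse per transformation step.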


>
> Sorry for the long post.  Any insights or guidance would be very helpful!
>
> - Mark Tullsen