[cfe-dev] C AST transformations / questionable use of AST serialization
Manuel Klimek
klimek at google.com
Wed Jun 11 23:18:58 PDT 2014
On Thu, Jun 12, 2014 at 12:10 AM, Mark Tullsen <tullsen at galois.com> wrote:
> Hi,
>
> We've been building a tool (eventually to be released BSD) for allowing
> programmers to write custom program properties (complexity, semantic,
> architectural, etc.) in a high level DSL (embedded in Haskell at the
> moment).
> We switched from a decent but ad hoc C99 parser to using the Clang front
> end and
> are very happy customers. We are using the libclang C interface via FFI.
>
> However, we lost one extremely useful capability in this transition. We
> had
> some really nice one-liners in our pre-clang days, e.g.,
>
> property1 = noUnreachableCode . removeDecls (hasPrefix "test_")
>
> // Remove all the declarations for test code from the project then
> // test to see if there is no unreachable code
>
> property2 = noUnreachableCode
> . removeDecls (hasPrefix "test_")
> . removeMembersFromStructs (hasPrefix "test_")
>
> // Ditto, but we also remove structure members that are only there
> // for testing purposes.
>
> If you don't grok Haskell:
> - The '.' above is function composition (like '|' in Unix)
> - removeDecls, removeMembersFromStructs, hasPrefix are higher order
> functions.
>
> With our switch to clang, we have lost the ability to do quick and easy
> wholesale project transformations like the above removeDecls function. We
> also
> have the need to do transformations that add to the code (e.g., inserting
> attributes). The output of these transformations (code slicing,
> mutations, extensions) may be
> only for intermediate use and is not necessarily output for the sake
> of code refactoring.
>
> I'd really like to regain the ability to achieve such transformations. As
> we explore
> ways to do this, these are some of my thoughts:
>
> - These modules
>
> Refactoring.h - Framework for clang refactoring tools
> Rewriter.h - Code rewriting interface
>
> seem to be designed for applying changes to the source and cannot
> be readily used to modify the AST (nor the serialized form of the AST).
>
> Correct?
>
Yes.
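
To make the distinction concrete, here is a minimal stand-alone sketch of the
source-level model those interfaces work with: a set of (offset, length,
replacement text) edits applied to a text buffer, never to AST nodes. It does
not use Clang at all; the Edit struct and applyEdits function are hypothetical
names for illustration (clang::tooling::Replacement carries similar fields
plus a file path).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a source-level edit: replace `length` bytes
// starting at byte `offset` with `text`. This is not Clang's actual class.
struct Edit {
  std::size_t offset;
  std::size_t length;
  std::string text;
};

// Apply non-overlapping edits to `source` and return the rewritten buffer.
// Edits are applied from the highest offset down, so the offsets of the
// remaining edits stay valid while the buffer changes size.
std::string applyEdits(std::string source, std::vector<Edit> edits) {
  std::sort(edits.begin(), edits.end(),
            [](const Edit &a, const Edit &b) { return a.offset > b.offset; });
  // Start offset of the previously applied edit (original coordinates).
  std::size_t upperBound = source.size();
  for (const Edit &e : edits) {
    assert(e.offset + e.length <= upperBound &&
           "edits overlap or run past the end of the buffer");
    source.replace(e.offset, e.length, e.text);
    upperBound = e.offset;
  }
  return source;
}
```

A removeDecls-style removal and an attribute insertion both reduce to such
edits, e.g. applyEdits("int test_x; int y;", {{0, 12, ""}, {17, 0,
" __attribute__((unused))"}}). The result is only text, so it has to be parsed
again before any further analysis can see the change.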
> - One approach I'm considering is to write a custom encoder/decoder for
> the
> serialized AST for our Haskell code. I.e., porting the
> clang::serialization
> stuff to Haskell so that we can read and write .ast files.
>
> I saw some long past post to this list that discouraged this.
> But my question is not so much whether you think (as C++ coders) this
> is the *preferable* way,
> but
>
> IF someone is really keen for a 3rd party (non C++) tool to
> transform the AST
>
> - Is the above replace-serialization approach even feasible?
>
I think it's feasible, but see below ;)
> - Any warnings/suggestions if we did try this?
>
- the AST is huge and changes somewhat frequently (not so much in itself,
but new AST nodes are introduced, etc); this might not be a big problem for
you if you only care about C, but it might lead to non-trivial maintenance
effort for the tool
- the AST invariants are hard to get right
In the end it's of course a cost-benefit trade-off. My best guess is that
it's usually not worth trying to maintain an adapted out-of-tree
serialization framework for clang's AST, but YMMV.
> - Are there alternative ways to do this that don't involve applying
> rewrites to the source and re-parsing?
>
I'm not aware of any.
>
> Sorry for the long post. Any insights or guidance would be very helpful!
>
> - Mark Tullsen