[cfe-dev] AST transformations
Alek Paunov
alex at declera.com
Sun Mar 13 11:49:11 PDT 2011
Hi Vassil,
On 12.03.2011 18:41, Vassil Vassilev wrote:
> On 12.3.2011 г. 19:14, Siegfried Rohdewald wrote:
>> Vassil Vassilev<vasil.georgiev.vasilev at ...> writes:
>>
>>>> Just a question in that direction: I am thinking about .ast -> DB ->
>>>> Transformations -> DB -> .ast
>>>>
>>>> Is it possible/reasonable idea ?
>>> Sorry for the stupid question but, what does DB stands for?
>> Database. That lets you use a database schema for the AST.
> I would have guess that but it seemed a bit strange to import the ast in
> a database...
> I still don't understand. It is possible but the question is what would
> be the advantages of that? I guess you want to use a database schema for
> cascade deletion...
By .ast I mean the BC-encoded serialization produced by the ASTWriter
(clang -emit-ast).
Over the past ten years, several projects have followed this (AST in
DB) approach. Two successful examples:
* JTransformer [1] - open source, Eclipse plugin, based on SWI Prolog
(the DB is a standard prolog fact store + indexes)
* SemmleCode/.QL [2] - closed source, compiles to SQL
For a proof-of-concept attempt, my proposal for the DB/query language
would be Berkeley DB XML/XQuery, because:
* XQuery naturally operates on subtrees.
* Further, I think that some more specialized languages (like TXL or
Stratego) can be compiled to XQuery.
* In the JunGL thesis [3], the author states that his language is close
to XQuery (in terms of the necessary characteristics).
* XQuery is a W3C standard with many implementations. I personally
think that for very large code bases, the right engine will be something
based on Pathfinder [4].
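To make the "operates on subtrees" point concrete, here is a minimal
sketch in Python (standing in for XQuery, using only the standard
library). The XML dump, its element names, and the helper
calls_by_function are hypothetical and are not Clang's actual export
schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical AST dump; element and attribute names are invented for
# illustration and are not Clang's actual XML schema.
AST_XML = """
<TranslationUnit>
  <FunctionDecl name="f">
    <CompoundStmt>
      <CallExpr callee="g"/>
      <IfStmt>
        <CallExpr callee="h"/>
      </IfStmt>
    </CompoundStmt>
  </FunctionDecl>
  <FunctionDecl name="g">
    <CompoundStmt/>
  </FunctionDecl>
</TranslationUnit>
"""

def calls_by_function(root):
    """Map each function name to the callees found anywhere in its subtree."""
    result = {}
    for fn in root.iter("FunctionDecl"):
        # .iter() walks the whole subtree, much like a descendant axis.
        result[fn.get("name")] = [c.get("callee") for c in fn.iter("CallExpr")]
    return result

root = ET.fromstring(AST_XML)
print(calls_by_function(root))  # {'f': ['g', 'h'], 'g': []}
```

In XQuery the same query would be a single path expression over the
descendants of each function node, which is the attraction here.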
As Siegfried said, the DB schema (XML Schema/RELAX NG in the XML DB
case) can help with validating the DB import and/or the state of the
trees after some transformation processing. This comes out of the box,
but I am afraid that for full/sound validation we will need to write
additional modules in XQuery (at least because of the need for semantic
checks of the references between the nodes).
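One such semantic check, sketched in Python rather than XQuery: every
reference must resolve to an existing node. The id/ref attribute
convention and the helper dangling_refs are assumptions made for this
example only.

```python
import xml.etree.ElementTree as ET

def dangling_refs(root):
    """Return the ref values that do not match any node id in the tree."""
    ids = {n.get("id") for n in root.iter() if n.get("id") is not None}
    return sorted(
        n.get("ref")
        for n in root.iter()
        if n.get("ref") is not None and n.get("ref") not in ids
    )

# Hypothetical tree: the second DeclRefExpr points at a missing declaration.
AST_XML = """
<TranslationUnit>
  <VarDecl id="d1" name="x"/>
  <DeclRefExpr ref="d1"/>
  <DeclRefExpr ref="d2"/>
</TranslationUnit>
"""

print(dangling_refs(ET.fromstring(AST_XML)))  # ['d2']
```

Running a check like this after each transformation would catch trees
that are still well-formed but no longer consistent.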
> And how you would do the transformations in the database? Can you give
> us more details on what you want to do?
I see two forms of transformations:
* In-place, using XQuery Update
* Projections using (often recursive) "constructor" functions, e.g.:
insert nodes your-module:ProjFunc1($base-node, $args) into $node
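The difference between the two forms can be sketched as follows, again
in Python as a stand-in for XQuery. The functions rename_in_place and
project and the element names are hypothetical; project plays the role
of a recursive "constructor" function.

```python
import xml.etree.ElementTree as ET

def rename_in_place(root, old, new):
    """In-place form: mutate matching nodes of the stored tree."""
    for node in root.iter():
        if node.get("name") == old:
            node.set("name", new)

def project(node):
    """Projection form: recursively build a new tree keeping only *Decl children."""
    out = ET.Element(node.tag, dict(node.attrib))
    for child in node:
        if child.tag.endswith("Decl"):
            out.append(project(child))
    return out

# Hypothetical AST fragment, not Clang's actual XML schema.
AST_XML = """
<TranslationUnit>
  <FunctionDecl name="f">
    <ParmVarDecl name="a"/>
    <CompoundStmt><ReturnStmt/></CompoundStmt>
  </FunctionDecl>
</TranslationUnit>
"""

root = ET.fromstring(AST_XML)
rename_in_place(root, "f", "f2")   # the stored tree itself changes
outline = project(root)            # a new tree; root is left untouched
print([e.tag for e in outline.iter()])
```

The in-place form corresponds to XQuery Update statements, while the
projection leaves the stored tree alone and returns a derived one.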
Advantages:
* A Stratego (or even "low level" :-) XQuery) transformation can be
sketched by almost anyone in several hours, whereas an equivalent (let's
say final) TreeTransform-based one will cost at least days for a
developer well trained in LLVM/Clang.
* Relatively easy (customized) unparsing and other query-based DB
results, such as a stable C++ XML representation [*].
What do you think?
Kind regards,
Alek
[*] Douglas Gregor often says that the XML export of the Clang AST needs
to use a standardized schema (stable, and not mirroring the current
Clang ASTs so closely). I think this is a perfect goal, but it can be
achieved and supported more easily via an XML->XML transformation of the
native Clang X.Y schema (using XSLT or XQuery) than via a C tree -> XML
transformation mixed into the XML printing phase (using C++).
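A minimal sketch of such an XML->XML step, in Python instead of XSLT or
XQuery. The mapping table NATIVE_TO_STABLE, the stable names, and the
helper to_stable are all hypothetical; the idea is that only this table
would need updating when the native schema changes between Clang
versions.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from version-specific node names to stable ones.
NATIVE_TO_STABLE = {
    "TranslationUnit": "unit",
    "CXXRecordDecl": "class",
    "FieldDecl": "field",
}

def to_stable(node):
    """Return the stable-schema elements for node (a list, possibly empty)."""
    children = [c for child in node for c in to_stable(child)]
    tag = NATIVE_TO_STABLE.get(node.tag)
    if tag is None:
        return children  # unmapped node: drop it, hoist its mapped children
    out = ET.Element(tag, dict(node.attrib))
    out.extend(children)
    return [out]

NATIVE_XML = """
<TranslationUnit>
  <CXXRecordDecl name="P">
    <AccessSpecDecl/>
    <FieldDecl name="x"/>
  </CXXRecordDecl>
</TranslationUnit>
"""

stable = to_stable(ET.fromstring(NATIVE_XML))
print([e.tag for e in stable[0].iter()])  # ['unit', 'class', 'field']
```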
[1]
http://sewiki.iai.uni-bonn.de/research/jtransformer/api/java/pefs/2.9/java_pef_overview
[2] http://en.wikipedia.org/wiki/SemmleCode
[3]
http://research.microsoft.com/pubs/79030/DPhil%20Thesis%20-%20Mathieu%20Verbaere.pdf
[4] http://www-db.informatik.uni-tuebingen.de/research/pathfinder