[cfe-dev] [RFC] [GSoC] AST differencing

Johannes Altmanninger via cfe-dev cfe-dev at lists.llvm.org
Fri Jun 23 09:21:39 PDT 2017


Manuel Klimek <klimek at google.com> writes:

> Hi Johannes,
>
> I think this is a great area of research. For Java, for example, refaster (
> http://errorprone.info/docs/refaster) exists which lets you express simple
> code changes with "before / after" samples. If we can get something like
> this based on Clang's AST, that would be most awesome - is this a direction
> you're working towards?

Hi Manuel,

Yes, this is definitely something that can be done, I will design for
that also. I am in the process of modularizing the API, so that it will
allow this kind of pattern matching.

So the main advantage of such template based refactoring rules is that
they are easy to write (compared to an AST-matchers based solution). I
think it is a very nice way of writing refactoring, I am working on a
prototype.

For the AST matching / diffing API I am thinking of getting rid of the
custom Node class and instead just let the user pass Decl or Expr
nodes. Also, it should be possible to pass a custom predicate for
matching nodes, this seems to be the best way to enable exact pattern
based matches - as opposed to the gumtree algorithm, which is based on a
heuristic.

Johannes

>
> Cheers,
> /Manuel
>
> On Thu, Jun 22, 2017 at 2:50 AM Johannes Altmanninger via cfe-dev <
> cfe-dev at lists.llvm.org> wrote:
>
>> Hi,
>>
>> I have been working on creating a syntax tree differencing library.
>> So far, my focus has been on getting the core algorithm right, some
>> parts are missing.
>>
>> I have submitted a patch [1]. I already received some great feedback and
>> I am always open to suggestions for improvement.
>>
>> Additionally, if you have ideas to discuss about this, please let me
>> know, then I can keep it in mind while designing.
>>
>> I think it should be fairly easy to create a tool that uses the same
>> output format as diff while only showing semantic changes. More advanced
>> tools seem to be quite possible also.
>>
>> One difficulty I am facing so far is that extracting values of
>> individual AST nodes requires me to manually combine all relevant
>> attributes. The value of a node is all things that make it semantically
>> different from other nodes of the same kind, except for its children.
>>
>> After an edit script has been computed it would be possible to apply it,
>> but in order such an edit script it will be necessary to implement a way
>> to do the opposite of extracting node information: updating the source
>> code based on the previously extracted information (node value). It is
>> probably not particularly difficult but it will require qutie some
>> effort to do it in a way that it can work for any node.
>>
>>
>> Johannes
>>
>> [1] https://reviews.llvm.org/D34329
>> _______________________________________________
>> cfe-dev mailing list
>> cfe-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>>



More information about the cfe-dev mailing list