[llvm-dev] [GSoC 2017] Clang-based diff tool project

Mehdi Amini via llvm-dev llvm-dev at lists.llvm.org
Mon Mar 20 16:47:34 PDT 2017


(+CC: Greg Clayton who gave me this idea in the first place)

> On Mar 20, 2017, at 3:20 PM, Johannes Altmanninger via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> 
> Hello,
> 
> I am currently studying Computer Science at TU Eindhoven. I am doing a
> course that involves programming assignments on parts of LLVM such as
> lowering, scheduling and optimization. For this year's Google Summer of
> Code I plan to submit a proposal to implement a clang-based diff tool
> [1].

Great! I look forward to seeing this :)

> 
> I think it really pays off to have decent developer tools available, as
> they can save tons of time. Clang tooling has obviously been very
> successful.  I think it would be a good idea to develop a diff tool that
> considers the structure of the code, as opposed to just the lines. Plain
> old diff only thinks in terms of "additions" and "deletions", although
> it would be more natural to also consider "updates" and "moves".
> 
> So a structural diff would work solely on the AST, hence formatting
> changes are ignored. It would allow highlighting the exact location of
> a change rather than a whole line. Furthermore, it would allow
> comparing pieces of code with the same structure (think subclasses).
> 
> Besides some papers with clever AST-matching algorithms, a quick web
> search yielded [2], which is a proof-of-concept implementation of a
> structural comparison algorithm.  I think it demonstrates rather nicely
> what could be done: movement of chunks of code can be easily traced.
> 
> Anyway, one could make all kinds of nice visualizations using an AST
> diff tool. However, I think the initial focus should probably be on
> creating one with output similar to traditional diff, except that
> updates and moves are displayed in an easily readable way, which could
> already improve developer productivity and happiness.
> 
> As of now I have one question: The output of the tool is meant just for
> humans to read (and not for actual patching), right?

Yes. But we usually develop software as libraries. In practice, I expect the main part of the work to be writing an API that generates an “in-memory” representation of the diff.

A tool that generates human-readable textual output is likely to be the first client of this API, and is likely critical for functionally testing it early in development. In the future, I hope it would enable other clients to plug in, such as graphical diff tools or git merge-resolution tools.
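To make the library/client split concrete, here is a rough sketch in Python. It is only an illustration under stated assumptions: `Node`, `Change`, `diff` and `render` are hypothetical names, the tree is a toy stand-in rather than Clang's AST, and the comparison is a naive positional walk that reports updates, insertions and deletions. Detecting moves would need a real tree-matching algorithm, which this does not attempt.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Node:
    """Toy AST node (hypothetical; not a Clang type)."""
    kind: str
    value: str = ""
    children: Tuple["Node", ...] = ()


@dataclass(frozen=True)
class Change:
    """One entry of the in-memory diff (hypothetical layout)."""
    op: str                      # "update", "insert" or "delete"
    path: Tuple[int, ...]        # child indices from the root to the node
    old: Optional[Node] = None
    new: Optional[Node] = None


def diff(old: Node, new: Node, path: Tuple[int, ...] = ()) -> List[Change]:
    """Library part: compare two trees position by position."""
    if old.kind != new.kind:
        # Different node kinds: report the whole subtree as replaced.
        return [Change("delete", path, old=old), Change("insert", path, new=new)]
    changes: List[Change] = []
    if old.value != new.value:
        changes.append(Change("update", path, old=old, new=new))
    common = min(len(old.children), len(new.children))
    for i in range(common):
        changes.extend(diff(old.children[i], new.children[i], path + (i,)))
    for i in range(common, len(old.children)):
        changes.append(Change("delete", path + (i,), old=old.children[i]))
    for i in range(common, len(new.children)):
        changes.append(Change("insert", path + (i,), new=new.children[i]))
    return changes


def render(changes: List[Change]) -> str:
    """Client part: turn the in-memory diff into human-readable text."""
    lines = []
    for c in changes:
        where = "/".join(map(str, c.path)) or "<root>"
        if c.op == "update":
            lines.append(f"~ {where}: {c.old.kind} '{c.old.value}' -> '{c.new.value}'")
        elif c.op == "delete":
            lines.append(f"- {where}: {c.old.kind} '{c.old.value}'")
        else:
            lines.append(f"+ {where}: {c.new.kind} '{c.new.value}'")
    return "\n".join(lines)


# Roughly "return 1;" versus "return 2; ;" inside a function f.
old = Node("FunctionDecl", "f",
           (Node("ReturnStmt", "", (Node("IntegerLiteral", "1"),)),))
new = Node("FunctionDecl", "f",
           (Node("ReturnStmt", "", (Node("IntegerLiteral", "2"),)),
            Node("NullStmt")))
changes = diff(old, new)
print(render(changes))
```

The point of the split is that `diff` returns plain data, so the textual `render` client can be used to test it, and a graphical or merge-oriented client could later consume the same list of changes without touching the comparison code.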

Best,

— 
Mehdi


