[clang-tools-extra] r187428 - cpp11-migrate: Write header replacements to disk

Tue Jul 30 15:52:42 PDT 2013

On Wed, Jul 31, 2013 at 12:26 AM, Chandler Carruth <chandlerc at google.com>wrote:

>
> On Tue, Jul 30, 2013 at 1:48 PM, Vane, Edwin <edwin.vane at intel.com> wrote:
>
>> > From: Alex Rosenberg [mailto:alexr at leftfield.org]
>> > Sent: Tuesday, July 30, 2013 4:31 PM
>> > To: Vane, Edwin
>> > Cc: Tareq A. Siraj; cfe-commits at cs.uiuc.edu
>> > Subject: Re: [clang-tools-extra] r187428 - cpp11-migrate: Write header
>> > replacements to disk
>> >
>> > YAML was included as a serialization format to be used only for
>> LLVM-internal
>> > testing tools, not as a supported output format. In fact, we discussed
>> expanding
>> > the role of YAMLTraits in order to support some of our actually used
>> output
>> > formats like plists to support generic logic in serialization.
>>
>> I'm not sure how the original intention for YAML is relevant to the
>> situation. We're making using of existing functionality in LLVM to simplify
>> a problem.
>>
>> > I'm pretty strongly against this direction.
>>
>> Why exactly?
>>
>> > I know of exactly zero tools that use YAML as a format for
>> diff/patch/rewriting.
>>
>> We don't want to target existing tools. One new tool exists so far:
>> https://github.com/revane/migmerge_git which is a prototype for a more
>> powerful tool we'll soon be building. These output files are not meant for
>> consumption by external tools (unless of course somebody wants to build
>> such a tool). They serve only as an intermediate format for data flowing
>> from >>1 C++11 Migrator processes to 1 Migration Post-Processor process via
>> the filesystem. Inter-process communication would be nice but it's not
>> easily portable.
>>
>> I've CC'd Manuel who, I think, is familiar with how Google solves this
>> problem internally with protocol buffers. I believe he suggested YAML but
>> we didn't exactly discuss the issue in detail. He might have something to
>> add from his experience.
>
>
> So, you didn't CC Manuel, and Manuel wasn't involved in the original
> discussion.
>

There was some pre-discussion on IRC, and I am actually following those
patches on a high level.

> Most of the problem here is that the original discussion was buried in a
> patch review thread that didn't really give any indication that there was a
> significant design decision being made here which would require review from
> a wider audience. I'm not sure the best way to achieve that, but I have
> some suggestions:
>
> 1) Don't start with a patch, start with just a sketch of a design. Doesn't
> have to be formal, and should be very brief.
> 2) Send the email to cfe-dev@ and put 'RFC' or something in the title.
> 3) Directly CC people you want design feedback from. I think Manuel is
> appropriate here.
>
> Anyways, Alex seems to raise an interesting design question. I don't
> actually have an opinion on it (haven't thought about it much), but I
> wanted to try to improve the process here as I suspect that lots of others
> will have thoughts on this subject within the community. A reply to a
> commit mail is the wrong place to have a detailed discussion about the
> design. cfe-dev@ is better.
>

I think the main reasons why using a structured format instead of patches
here are:
The upside of "patch" formats doesn't apply here: patches are great if you
want them to apply on potentially unknown slightly different revisions.
Otherwise I don't see big pros for patches.
On the other hand, patches have a major downside here:
a) they're too coarse - they're usually line based; refactoring tools often
need to handle replacements that fix stuff (non-overlapping) on the same
line
b) they're hard to "work with" other than applying them; patch libraries
usually just give you that, so if you want to extract the exact ranges on
what is changing (for example to do deduplication, or to do other location
based calculations) you're re-writing your own patch-parsing code
c) they're much harder to generate from a clang-tool;
offset+length+replacement is something you just "have", no need to
construct the actual end-result and run algorithms over the strings in the
tool

To me patches are basically an optional 3rd level interface here, but we
need something more fine grained in a structured format. Using YAML (or
JSON) seems an obvious choice for that in the llvm context.

Cheers,
/Manuel
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20130731/c866aa11/attachment.html>