<div dir="ltr">On Wed, Jul 31, 2013 at 7:20 PM, Du Toit, Stefanus <span dir="ltr"><<a href="mailto:stefanus.du.toit@intel.com" target="_blank">stefanus.du.toit@intel.com</a>></span> wrote:<br><div class="gmail_extra">

<div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">From:  Manuel Klimek <<a href="mailto:klimek@google.com">klimek@google.com</a>>:<br>

<div class="im">><br>

> On Wed, Jul 31, 2013 at 6:33 PM, Du Toit, Stefanus<br>

> <<a href="mailto:stefanus.du.toit@intel.com">stefanus.du.toit@intel.com</a>> wrote:<br>

><br>

> > On Wed, Jul 31, 2013 at 5:40 PM, Vane, Edwin<br>

><<a href="mailto:edwin.vane@intel.com">edwin.vane@intel.com</a>> wrote:<br>

> > -----Original Message-----<br>

> > From: Vane, Edwin<br>

> > Sent: Wednesday, July 31, 2013 11:40 AM<br>

> ><br>

> > To: Clang Dev List (<a href="mailto:cfe-dev@cs.uiuc.edu">cfe-dev@cs.uiuc.edu</a>)<br>

> > Subject: RFC: YAML as an intermediate format for<br>

>clang::tooling::Replacement data on disk<br>

> ><br>

> > > Hi all,<br>

> > ><br>

> > > This discussion began on cfe-commits as the result of a commit<br>

>(Tareq's poor header replacement patch that keeps getting reverted due to<br>

>Windows build-bot issues). The start of the thread is here if you want<br>

>background info:<br>

> > ><br>

> > ><br>

><a href="http://lists.cs.uiuc.edu/pipermail/cfe-commits/Week-of-Mon-20130729/084881.html" target="_blank">http://lists.cs.uiuc.edu/pipermail/cfe-commits/Week-of-Mon-20130729/084881.html</a><br>

</div>

<div class="im">> > ><br>

> > > The proposal: The C++11 Migrator has a need to write Replacement<br>

>data: offset, length, and replacement text, to disk. The replacement data<br>

>describes changes made to a header while transforming one or more TU's.<br>

>All the replacement data would be gathered up<br>

> > > after an entire code-base is transformed by a separate tool and<br>

>merged together to produce actual changes to headers. So the point is to<br>

>serialize Replacement data as a form of inter-process communication using<br>

>the file system as the communication link. Real<br>

> > > inter-process communication is a possibility but not portable.<br>

><br>

> > I have to wonder whether it's not easier to just ensure that headers<br>

>are only transformed once.<br>

> ><br>

> > I understand there's the issue of deciding what compiler flags to use<br>

>when processing a header. My thoughts on that:<br>

> > * For some projects, there aren't any per-file compiler flags, so it<br>

>would be sufficient to just pass a general set of flags to the tool on<br>

>the command line (e.g., with made up parameter syntax, something like<br>

>'cpp11-migrate *.h --compile-flags="-I../include<br>

</div>> > -DFOO"'…)<br>

<div class="im">> > * For other projects, a simple heuristic of matching<br>

</div>>"foo.{cpp,cc,cxx,…}" to "foo.{h,hh,hpp,…}" might be enough (lots of<br>

<div class="im">>details to sort out here, like how to specify the directory structure,<br>

>but hopefully you get the idea)<br>

> > * For more complicated cases, one could add (whether manually or using<br>

>some tool) entries to the compilation database for header files<br>

> ><br>

> > With that in mind, why not treat header files like source files and<br>

>process them separately?<br>

><br>

> How do you propose to treat template instantiations?<br>

><br>

> For example:<br>

> a.h:<br>

> template <typename T> class A { public: void x(T t) { t.y(); } };<br>

><br>

> x.cc:<br>

> A<C> a; a.x(C());<br>

><br>

> Imagine we want to change C::y -> C::z. Now depending on which types A<br>

>is instantiated with, it might be totally safe to refactor t.y() in A or<br>

>not. So there needs to be a postprocessing step that figures that out<br>

>anyway.<br>

<br>

</div>Given a class like:<br>

<br>

template<typename T><br>

class MyVector {<br>

public:<br>

  MyVector() : m_begin(0), m_end(0) {}<br>

  MyVector(std::size_t size) : m_begin(new T[size]), m_end(m_begin + size) {}<br>

<br>

  // ...<br>

<br>

private:<br>

  T* m_begin;<br>

  T* m_end;<br>

};<br>

<br>

I would love for cpp11-migrate to be able to turn that into something like:<br>

<br>

template<typename T><br>

class MyVector {<br>

public:<br>

  MyVector() = default;<br>

  MyVector(std::size_t size) : m_begin(new T[size]), m_end(m_begin + size) {}<br>

<br>

  // ...<br>

<br>

<br>

private:<br>

  T* m_begin = nullptr;<br>

  T* m_end = nullptr;<br>

};<br>

<br>

And I contend that it doesn't need to know what MyVector could be<br>

instantiated with in order to do that.<br></blockquote><div><br></div><div>I'm not sure how an example of a change that doesn't need to know the instantiations bears on the question. I fully agree that such changes exist :) I'm merely pointing out that changes that do depend on the instantiations exist, too.</div>
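<div><br></div><div>To make the other kind concrete, here is a minimal sketch along the lines of my a.h example. C and D are hypothetical types, and useBoth() just stands in for two separate TUs that each instantiate A. Renaming C::y to C::z and rewriting t.y() in A looks safe from the TU that only sees A<C>, but it would break the TU that instantiates A<D>:</div>

```cpp
#include <cassert>
#include <string>

// Hypothetical types for illustration only.
struct C {
  std::string y() const { return "C::y"; }  // the method we want to rename to z
};

struct D {
  std::string y() const { return "D::y"; }  // unrelated method that keeps its name
};

// Lives in a shared header. Whether t.y() may be rewritten to t.z() depends
// on every T that A is instantiated with, across all TUs.
template <typename T>
class A {
public:
  std::string x(T t) { return t.y(); }
};

// Stands in for two TUs: one instantiates A<C>, the other A<D>.
inline void useBoth() {
  A<C> ac;  // seen alone, this suggests rewriting t.y() to t.z()
  A<D> ad;  // this instantiation would stop compiling after that rewrite
  assert(ac.x(C()) == "C::y");
  assert(ad.x(D()) == "D::y");
}
```

<div>Only after all TUs have been seen can a tool decide whether the edit inside A is safe, which is why I think a postprocessing step is needed anyway.</div>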

<div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

Now, I totally understand that I may be asking for something very<br>

difficult to do in Clang today. In which case, I'll accept whatever<br>

limitations there are. But I don't think that assuming we'll see the full<br>

set of instantiations is a great solution either.<br></blockquote><div><br></div><div>I think it'll be a necessary solution for many refactoring tools.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<div class="im">

> > If the issue is parallel compilation, deferring the replacements makes<br>

>perfect sense as a way to resolve any read-write conflicts (transforming<br>

>one header while it's being parsed as part of another TU). However, if<br>

>you ensure that a header isn't touched by<br>

> > multiple transformations, and generally ensure that transformations<br>

>don't clobber each other by design, there's no need to merge anything.<br>

> ><br>

> > Personally I would even accept a slightly more limited set of<br>

>transformations in exchange for never having to worry about merging.<br>

><br>

><br>

> "Merging" is usually merely deduplication, which is not hard. I have the<br>

>feeling that you think there's lots of complexity where it isn't. I'd<br>

>definitely say that the heuristics you propose in order to be able to<br>

>process headers on their own are much more complex<br>

> than the issue of deduplicating edits.<br>

<br>

</div>If it's just deduplication, it's no big deal - totally agreed.<br>

<br>

I may have been thrown off by this: <a href="https://github.com/revane/migmerge_git" target="_blank">https://github.com/revane/migmerge_git</a><br>

<br>

That tool seems to (aim to) handle a lot more than deduplication.<br>

<br>

More importantly: I certainly hope it's not going to be necessary to<br>

download a separate "merge tool" in order to use cpp11-migrate.<br></blockquote><div><br></div><div>I actually agree that it makes sense to have cpp11-migrate work on a single TU (or multiple TUs) without the need for an extra tool, by just running in-process over all the TUs and applying all the changes.</div>
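<div><br></div><div>For what it's worth, a rough sketch of what I mean by deduplicating and then applying the changes. Edit is a hypothetical stand-in for clang::tooling::Replacement that carries only the offset, length, and replacement text from the RFC; deduplicate() and applyEdits() are made-up names, not existing Clang APIs, and the sketch assumes all edits target one file and don't overlap:</div>

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical stand-in for clang::tooling::Replacement: just the three
// fields named in the RFC (offset, length, replacement text).
struct Edit {
  std::size_t offset;
  std::size_t length;
  std::string text;
};

inline bool operator<(const Edit &a, const Edit &b) {
  return std::tie(a.offset, a.length, a.text) <
         std::tie(b.offset, b.length, b.text);
}

inline bool operator==(const Edit &a, const Edit &b) {
  return std::tie(a.offset, a.length, a.text) ==
         std::tie(b.offset, b.length, b.text);
}

// Drop identical edits that several TUs produced for the same header.
// The result is sorted by offset.
inline std::vector<Edit> deduplicate(std::vector<Edit> edits) {
  std::sort(edits.begin(), edits.end());
  edits.erase(std::unique(edits.begin(), edits.end()), edits.end());
  return edits;
}

// Apply sorted, non-overlapping edits back to front so that the offsets of
// edits earlier in the file stay valid.
inline std::string applyEdits(std::string source,
                              const std::vector<Edit> &edits) {
  for (auto it = edits.rbegin(); it != edits.rend(); ++it) {
    assert(it->offset + it->length <= source.size());
    source.replace(it->offset, it->length, it->text);
  }
  return source;
}
```

<div>Two TUs that both want the same change to a shared header produce identical edits, so after deduplicate() each change is applied exactly once.</div>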

<div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

To me, cpp11-migrate is the kind of tool I'm just not going to use if it's<br>

not dead simple to use. As a user, I don't really care how it works<br>

internally, and I don't want to care :). I _am_ willing to tell it (a<br>

reasonable amount of) things about my code base that it can't easily infer<br>

itself.<br></blockquote><div><br></div><div>cpp11-migrate is something that I'd expect most users to run *once* in their life, over a whole code-base. Given that, I don't think the overhead of running a separate tool would be too much effort. It'd be great if it also worked without the extra tool (as noted above), but I don't think that's a necessity.</div>

<div><br></div><div>Cheers,</div><div>/Manuel</div><div><br></div></div></div></div>