<div dir="ltr">On Wed, Jul 31, 2013 at 7:20 PM, Du Toit, Stefanus <span dir="ltr"><<a href="mailto:stefanus.du.toit@intel.com" target="_blank">stefanus.du.toit@intel.com</a>></span> wrote:<br><div class="gmail_extra">
<div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">From: Manuel Klimek <<a href="mailto:klimek@google.com">klimek@google.com</a>>:<br>
<div class="im">><br>
> On Wed, Jul 31, 2013 at 6:33 PM, Du Toit, Stefanus<br>
> <<a href="mailto:stefanus.du.toit@intel.com">stefanus.du.toit@intel.com</a>> wrote:<br>
><br>
> > On Wed, Jul 31, 2013 at 5:40 PM, Vane, Edwin<br>
><<a href="mailto:edwin.vane@intel.com">edwin.vane@intel.com</a><mailto:<a href="mailto:edwin.vane@intel.com">edwin.vane@intel.com</a>>> wrote:<br>
> > -----Original Message-----<br>
> > From: Vane, Edwin<br>
> > Sent: Wednesday, July 31, 2013 11:40 AM<br>
> ><br>
> > To: Clang Dev List (<a href="mailto:cfe-dev@cs.uiuc.edu">cfe-dev@cs.uiuc.edu</a><mailto:<a href="mailto:cfe-dev@cs.uiuc.edu">cfe-dev@cs.uiuc.edu</a>>)<br>
> > Subject: RFC: YAML as an intermediate format for<br>
>clang::tooling::Replacement data on disk<br>
> ><br>
> > > Hi all,<br>
> > ><br>
> > > This discussion began on cfe-commits as the result of a commit<br>
>(Tareq's poor header replacement patch that keeps getting reverted due to<br>
>Windows build-bot issues). The start of the thread is here if you want<br>
>background info:<br>
> > ><br>
> > ><br>
><a href="http://lists.cs.uiuc.edu/pipermail/cfe-commits/Week-of-Mon-20130729/084881" target="_blank">http://lists.cs.uiuc.edu/pipermail/cfe-commits/Week-of-Mon-20130729/084881</a><br>
>.html<br>
</div>><<a href="http://lists.cs.uiuc.edu/pipermail/cfe-commits/Week-of-Mon-20130729/08488" target="_blank">http://lists.cs.uiuc.edu/pipermail/cfe-commits/Week-of-Mon-20130729/08488</a><br>
>1.html>.<br>
<div class="im">> > ><br>
> > > The proposal: The C++11 Migrator has a need to write Replacement<br>
>data: offset, length, and replacement text, to disk. The replacement data<br>
>describes changes made to a header while transforming one or more TU's.<br>
>All the replacement data would be gathered up<br>
> > > after an entire code-base is transformed by a separate tool and<br>
>merged together to produce actual changes to headers. So the point is to<br>
>serialize Replacement data as a form of inter-process communication using<br>
>the file system as the communication link. Real<br>
> > > inter-process communication is a possibility but not portable.<br>
><br>
> > I have to wonder whether it's not easier to just ensure that headers<br>
>are only transformed once.<br>
> ><br>
> > I understand there's the issue of deciding what compiler flags to use<br>
>when processing a header. My thoughts on that:<br>
> > * For some projects, there aren't any per-file compiler flags, so it<br>
>would be sufficient to just pass a general set of flags to the tool on<br>
>the command line (e.g., with made up parameter syntax, something like<br>
>'cpp11-migrate *.h ‹compile-flags="-I../include<br>
</div>> > DFOO"'Š)<br>
<div class="im">> > * For other projects, a simple heuristic of matching<br>
</div>>"foo.{cpp,cc,cxx,Š}" to "foo.{h,hh,hpp,Š}" might be enough (lots of<br>
<div class="im">>details to sort out here, like how to specify the directory structure,<br>
>but hopefully you get the idea)<br>
> > * For more complicated cases, one could add (whether manually or using<br>
>some tool) entries to the compilation database for header files<br>
> ><br>
> > With that in mind, why not treat header files like source files and<br>
>process them separately?<br>
><br>
> How do you propose to treat template instantiations?<br>
><br>
> For example:<br>
> a.h:<br>
> template <typename T> class A { void x(T t) { t.y(); }}<br>
><br>
> x.cc:<br>
> A<C> a; a.x();<br>
><br>
> Imagine we want to change C::y -> C::z. Now depending on which types A<br>
>is instantiated with, it might be totally safe to refactor t.y() in A or<br>
>not. So there needs to be a postprocessing step that figures that out<br>
>anyway.<br>
<br>
</div>Given a class like:<br>
<br>
template<typename T><br>
class MyVector {<br>
MyVector() : m_begin(0), m_end(0) {}<br>
MyVector(std::size_t size) : m_begin(new T[size]), m_end(m_begin + size)<br>
{}<br>
<br>
// ...<br>
<br>
private:<br>
T* m_begin;<br>
T* m_end;<br>
};<br>
<br>
I would love for cpp11-migrate to be able to turn that into something like:<br>
<br>
template<typename T><br>
class MyVector {<br>
MyVector() = default;<br>
MyVector(std::size_t size) : m_begin(new T[size]), m_end(m_begin + size)<br>
{}<br>
<br>
// ...<br>
<br>
<br>
private:<br>
T* m_begin = nullptr;<br>
T* m_end = nullptr;<br>
};<br>
<br>
And I contend that it doesn't need to know what MyVector could be<br>
instantiated with in order to do that.<br></blockquote><div><br></div><div>I'm not sure what an example of a change that doesn't need to know the instantiations has to do with it. I fully agree that those exist :) I'm merely proposing that the others exist, too.</div>
<div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Now, I totally understand that I may be asking for something very<br>
difficult to do in Clang today. In which case, I'll accept whatever<br>
limitations there are. But I don't think that assuming we'll see the full<br>
set of instantiations is a great solution either.<br></blockquote><div><br></div><div>I think it'll be a necessary solution for many refactoring tools.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div class="im">
> > If the issue is parallel compilation, deferring the replacements makes<br>
>perfect sense as a way to resolve any read-write conflicts (transforming<br>
>one header while it's being parsed as part of another TU). However, if<br>
>you ensure that a header isn't touched by<br>
> > multiple transformations, and generally ensure that transformations<br>
>don't clobber each other by design, there's no need to merge anything.<br>
> ><br>
> > Personally I would even accept a slightly more limited set of<br>
>transformations in exchange for never having to worry about merging.<br>
><br>
><br>
> "Merging" is usually merely deduplication, which is not hard. I have the<br>
>feeling that you think there's lots of complexity where it isn't. I'd<br>
>definitely say that the heuristics you propose in order to be able to<br>
>process headers on their own are much higher<br>
> than the issue of deduplicating edits.<br>
<br>
</div>If it's just deduplication, it's no big deal - totally agreed.<br>
<br>
I may have been thrown off by this: <a href="https://github.com/revane/migmerge_git" target="_blank">https://github.com/revane/migmerge_git</a><br>
<br>
That tool seems to (aim to) handle a lot more than deduplication.<br>
<br>
More importantly: I certainly hope it's not going to be necessary to<br>
download a separate "merge tool" in order to use cpp11-migrate.<br></blockquote><div><br></div><div>I actually agree that it makes sense to have cpp11-migrate work on single TUs (or multiple TUs) without the need for an extra tool, by just running in process over all the TUs and applying all the changes.</div>
<div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
To me, cpp11-migrate is the kind of tool I'm just not going to use if it's<br>
not dead simple to use. As a user, I don't really care how it works<br>
internally, and I don't want to care :). I _am_ willing to tell it (a<br>
reasonable amount of) things about my code base that it can't easily infer<br>
itself.<br></blockquote><div><br></div><div>cpp11-migrate is something that I'd expect most users to run *once* in their life, over a whole code-base. I don't see that the extra overhead of running a special tool would be too much effort. I also think it'd be great if it still worked without the extra tool (as noted above), but I don't think it's a necessity.</div>
<div><br></div><div>Cheers,</div><div>/Manuel</div><div><br></div></div></div></div>