<div dir="ltr"><div>That is a good point.</div><div><br></div><div>With the multi-repo plan, we were planning to take the existing git repositories that everyone is already using, and has based work on, and make them official.</div><div><br></div><div>However, with the single-repo plan, we'd be making a brand new git repository, with an integrated/interleaved history. As such, all the commit hashes would be different, and even the directory layout would be different from the current git-svn repositories. And so we would "strand" all existing forks -- they'd be unable to easily pull in new changes to these repositories after the migration.<br></div><div><br></div><div>That we'll be getting incompatible history has been glossed over, and it is indeed really important to make it clear and have a good plan there. This doesn't only affect actual "forks"; it also affects every single developer with a local git clone that contains unfinished work.</div><div><br></div><div><div>Therefore, we must come up with a plan to allow such users to rebase their existing work onto the new repository structure. That could be either documentation describing the git commands people need to run or, if it's really complicated, a script.</div></div><div><br></div><div>I don't think this is a really hard problem though -- I can think of a few ways to help existing users that probably will work (although I'd want to try them first, to ensure they actually do work, of course). The two I'm thinking of are: just doing "git diff" followed by "git apply --directory=llvm", if you just want to save a patch; or some "git filter-branch" invocation to rename all the files in your existing repo, followed by "git rebase" (or "git merge"), if you have some more history you want to maintain.</div><div><br></div><div>To me, it seems eminently worth it to pay a one-time transition cost like that, if it makes life easier afterwards, which I believe the single-repo system would do. 
As long as it's documented well so not every developer needs to figure it out on their own.</div></div><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Jul 21, 2016 at 2:51 AM, David Chisnall <span dir="ltr"><<a href="mailto:david.chisnall@cl.cam.ac.uk" target="_blank">david.chisnall@cl.cam.ac.uk</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">On 21 Jul 2016, at 07:12, Renato Golin via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>> wrote:<br>
><br>
>> I don't much care which of those is chosen. I have a slight preference for<br>
>> #1, for ease of doing things like grep/log/etc on llvm by itself, excluding<br>
>> all the other projects. But either way seems probably fine, and an<br>
>> improvement over multiple repositories.<br>
><br>
> I don't have a strong preference, but #1 proponents weakly convinced<br>
> me with two arguments:<br>
><br>
> 1. it is easier to mix-and-match repositories as you like<br>
><br>
> I'd still symlink as I do today, but I can see why this would be<br>
> interesting for off-tree users.<br>
><br>
> 2. it "makes more sense" to let Clang *use* LLVM instead of LLVM *host* Clang<br>
><br>
> this seems more preference than anything, but people that know CMake<br>
> more than I do said it would be "easier" and I trust them. I have no<br>
> technical arguments pro or against.<br>
><br>
> Though, I'd be fine with anything really.<br>
<br>
</span>First of all, thank you very much for driving this, Renato. It’s a horrible task to do and I’m very grateful that you’ve taken this on.<br>
<br>
I would, however, like to add one argument against a single repo model. If you look at the current LLVM GitHub repo, GitHub is tracking 806 forks. It is tracking 595 forks for clang. Not everyone using git for downstream development has a fork on GitHub. In particular, GitHub does not allow private forks of public repos, so anyone who has a non-public git fork of LLVM will have done a git clone and a git push to their own private repo (on or off GitHub). I know of about a dozen such private repos and (for some bizarre reason) most companies don’t tell me about the secret things that they’re doing with LLVM so there are undoubtedly a lot more that I don’t know about.<br>
<br>
Conservatively, I would estimate that we have at least a thousand downstream forks of the current LLVM git repository. Moving to a single repo model will break all of them. It is completely unacceptable to break so many downstream consumers unless we are able to provide them with some coherent migration plan, but I have not seen anyone in the single-repo camp suggest anything.<br>
<span class="HOEnZb"><font color="#888888"><br>
David<br>
<br>
</font></span></blockquote></div><br></div>
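<div>P.S. A rough sketch of the first option mentioned in my reply ("git diff" followed by "git apply --directory=llvm"), run on two throwaway repositories. The repository names "old" and "new", the file lib/Foo.txt, and the commit helper are all made up for illustration, and I'm assuming the single repo would put llvm's files under an llvm/ subdirectory. It has not been tried against the real repositories.</div>

```shell
#!/bin/sh
# Hypothetical demo: carry uncommitted work from an old llvm-only checkout
# into a single-repo checkout where the same files live under llvm/.
set -eu

work=$(mktemp -d)

# Helper (made up for this demo): commit with a throwaway identity so the
# script does not depend on any global git configuration.
commit() {
    git -C "$1" -c user.name=demo -c user.email=demo@example.invalid \
        commit -q -m "$2"
}

# "old" stands in for an existing git-svn clone of llvm (files at top level).
git init -q "$work/old"
mkdir -p "$work/old/lib"
printf 'line one\n' > "$work/old/lib/Foo.txt"
git -C "$work/old" add .
commit "$work/old" "initial import"

# "new" stands in for the single repo, with the same file under llvm/.
git init -q "$work/new"
mkdir -p "$work/new/llvm/lib"
printf 'line one\n' > "$work/new/llvm/lib/Foo.txt"
git -C "$work/new" add .
commit "$work/new" "initial import"

# Unfinished local work in the old checkout.
printf 'line two\n' >> "$work/old/lib/Foo.txt"

# Save the work as a patch, then apply it with every path prefixed by llvm/.
git -C "$work/old" diff > "$work/wip.patch"
git -C "$work/new" apply --directory=llvm "$work/wip.patch"

# Show what changed in the new checkout.
git -C "$work/new" status --short
```

<div>This only carries over uncommitted changes as a single patch. Anyone with local commits they want to keep would need the history-rewriting route instead, which is not sketched here.</div>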