[llvm-dev] [RFC] One or many git repositories?
Mehdi Amini via llvm-dev
llvm-dev at lists.llvm.org
Fri Jul 22 16:46:24 PDT 2016
> On Jul 22, 2016, at 4:42 PM, Justin Lebar <jlebar at google.com> wrote:
>> 2) Use git subtree instead of git submodule to add all the directories to match the layout of llvm-project.
> This loses unified history, yes?
It does not “lose” unified history; it preserves the existing git history. I understand what you mean but I disagree with your choice of words :)
By rewriting the history you *add* a new feature (not supported by the official git repo) which is this “unified history”.
> That is, I cannot go back in time
> and check out a single git revision that corresponds to a single SVN
> revision? This is a key property, I think. It's what lets us bisect,
> for example.
>> This is what I proposed except I’m not using subtree but an explicit move commit in the existing repo before merging them
> Looking at https://github.com/joker-eph/llvm-unified/commits/master,
> this repository seems to lack the property that you can go back in
> time and check out a git revision that corresponds to a single SVN
> revision.
> Given that we have the capability to port branches onto a version
> of the monorepo that has unified history, I think using such a
> repository is strongly preferable to the alternative. The alternative
> may make it easier to port old branches, but that's a one-time cost.
> We'll pay the cost of not having unified history forever.
Which cost? Do you still bisect revisions from a few years ago?
>  https://github.com/jlebar/llvm-port-commits
> On Fri, Jul 22, 2016 at 4:35 PM, Mehdi Amini via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
>>> On Jul 22, 2016, at 1:16 AM, Simon Taylor via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>> Hi all,
>>> I’ll start by saying I’ve skimmed this thread and am not actually a user of LLVM at all, but had some git thoughts that might be worth contributing.
>>>> On 22 Jul 2016, at 01:16, Sanjoy Das via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>>> @David Chisnall and others with local forks: can you spot any
>>>> potential issues with Mehdi's plan? Are there cases where it won't
>>> One potential “issue” is that a single commit into the monolithic repository would potentially touch multiple subprojects (that’s one of the advantages). Projecting that into individual repositories would only commit changes to those files, but the commit message would be maintained and might therefore be confusing in the context of the individual repository, especially if only a small part of the commit affects that individual sub-repo.
>>> Essentially if the projects are “supposed” to be separate modules, then submodules is the solution to enforce that independence, ensuring commits in each module only affect that module and have appropriate commit messages for that context.
>>> If the submodules are in practice more intertwined than that, then it does feel like an ideologically pure solution that in the end just gets in the way of developer productivity.
>>> I’ve got a setup here that uses a hierarchy of submodules, so there is a “combined” submodule that just ensures that its children (other submodules) are at mutually compatible versions. That helped productivity (multiple consumers of the “combined” submodule don’t need to manually track versions of all the children) but this discussion is pushing me towards the thought that actually a monorepo would be a more productive solution anyway, and make more sense for cross-cutting changes.
>>> And sorry to throw another option into the ring; and one that might already have been discussed and discounted, but thought it worth sharing.
>>> 1) Create a new llvm-project-mono repo
>>> 2) Use git subtree instead of git submodule to add all the directories to match the layout of llvm-project.
>>> 3) From now on, all commits go to the monorepo
>>> 4) monorepo commits can be projected to the individual project repos, and additionally a new commit on llvm-project can be made with the submodule version updates
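Steps 1 and 2 of the list above could look roughly like the following sketch. It uses the subtree merge recipe built from core git commands (`merge -s ours` plus `read-tree --prefix`) rather than the `git subtree` wrapper, which is an assumption on my part, and the toy `clang` repository and its single file are hypothetical. `--allow-unrelated-histories` needs git 2.9 or later.

```shell
#!/bin/sh
# Sketch only: add an existing project to a new monorepo under a prefix,
# keeping the project's original commits (and hashes) as ancestors.
set -e
work=$(mktemp -d)
cd "$work"

# A stand-in for an existing subproject repository.
git init -q clang
git -C clang config user.email "dev@example.com"
git -C clang config user.name "Example Dev"
echo "clang source" > clang/main.c
git -C clang add main.c
git -C clang commit -q -m "Initial commit in clang"
clang_head=$(git -C clang rev-parse HEAD)

# Step 1: create the new monorepo with an empty root commit.
git init -q llvm-project-mono
git -C llvm-project-mono config user.email "dev@example.com"
git -C llvm-project-mono config user.name "Example Dev"
git -C llvm-project-mono commit -q --allow-empty -m "Create monorepo"

# Step 2: record a merge with the subproject history, then place the
# subproject tree under clang/ before concluding the merge commit.
git -C llvm-project-mono fetch -q ../clang HEAD
git -C llvm-project-mono merge -q -s ours --no-commit \
    --allow-unrelated-histories "$clang_head"
git -C llvm-project-mono read-tree --prefix=clang/ -u "$clang_head"
git -C llvm-project-mono commit -q -m "Add clang as a subtree under clang/"

parents=$(git -C llvm-project-mono rev-list --parents -n 1 HEAD | wc -w | tr -d ' ')
echo "merge commit fields (hash + parents): $parents"
```

Because the original commits are merged in unchanged, their hashes stay the same in the monorepo, and the result has more than one root commit.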
>> This is what I proposed except I’m not using subtree but an explicit move commit in the existing repo before merging them: https://github.com/joker-eph/llvm-unified
>> The reason I didn’t go with subtree merging is that it breaks `git log --follow path/to/file`. I suspect not many tools (e.g. blame history in a text editor) support the subtree metadata in the merge commit.
>> Do you see any drawback to what I did instead?
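The "explicit move commit, then merge" approach described above can be sketched as follows. This is a minimal reconstruction under my own assumptions, not the actual script behind llvm-unified: the two toy repositories and their `main.c` files are hypothetical, and `--allow-unrelated-histories` needs git 2.9 or later.

```shell
#!/bin/sh
# Sketch only: move each project into its own directory with an ordinary
# commit, then merge the unrelated histories into one repository.
set -e
work=$(mktemp -d)
cd "$work"

# Helper: create a tiny repository with a single commit.
make_repo() {
    git init -q "$1"
    git -C "$1" config user.email "dev@example.com"
    git -C "$1" config user.name "Example Dev"
    echo "$1 source" > "$1/main.c"
    git -C "$1" add main.c
    git -C "$1" commit -q -m "Initial commit in $1"
}

make_repo llvm
make_repo clang

# In each repository, one explicit commit moves the sources under a
# directory named after the project.
for p in llvm clang; do
    mkdir "$p/$p"
    git -C "$p" mv main.c "$p/main.c"
    git -C "$p" commit -q -m "Move $p sources into $p/"
done

# Merge the clang history into the llvm repository. The paths no longer
# overlap, so the merge is conflict-free.
git -C llvm fetch -q ../clang HEAD
git -C llvm merge -q --allow-unrelated-histories \
    -m "Merge clang history" FETCH_HEAD

# The move is a plain rename commit, so --follow can track the file back
# to its pre-move path.
history=$(git -C llvm log --follow --format=%s -- clang/main.c)
printf '%s\n' "$history"
```

The design choice is that the relocation is recorded as a normal rename in a normal commit, which path-following tools already understand, instead of being implied by the merge itself.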
>>> - No change for existing downstream users unless they want to move to the mono view
>>> - Easier developer experience for cross-cutting changes
>>> - Git log by path would work identically on either view of the repository
>>> - Hashes from before the creation of the mono repo would match in both views - the mono repo will have multiple roots but that’s not unusual with git subtree
>>> - Step 4 from my list would need a script to keep things updated. A server-side hook would be best. The mapping is deterministic (every mono repo commit will map to one commit in any affected submodules and one “submodule update” commit in the umbrella llvm-project repo), so if the server responsible falls over the updates might be delayed but can be caught up without losing anything
>>> - Less ideologically pure in terms of trying to keep the modules independent
>>> - Commit hashes will diverge between the two views from the creation of the mono repo, making comparisons / merges between clones of the different views more difficult
>>> LLVM Developers mailing list
>>> llvm-dev at lists.llvm.org