[llvm-dev] [RFC] One or many git repositories?
Simon Taylor via llvm-dev
llvm-dev at lists.llvm.org
Fri Jul 22 01:16:19 PDT 2016
I’ll start by saying I’ve skimmed this thread and am not actually a user of LLVM at all, but had some git thoughts that might be worth contributing.
> On 22 Jul 2016, at 01:16, Sanjoy Das via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> @David Chisnall and others with local forks: can you spot any
> potential issues with Mehdi's plan? Are there cases where it won't
One potential “issue” is that a single commit to the monolithic repository could touch multiple subprojects (that’s one of the advantages). Projecting it into an individual repository would commit only the changes to that repository’s files, but the commit message would be kept unchanged and might therefore be confusing in that context, especially if only a small part of the commit affects that sub-repo.
Essentially, if the projects are “supposed” to be separate modules, then submodules are the way to enforce that independence, ensuring commits in each module only affect that module and have appropriate commit messages for that context.
If the projects are in practice more intertwined than that, then submodules do feel like an ideologically pure solution that in the end just gets in the way of developer productivity.
I’ve got a setup here that uses a hierarchy of submodules, so there is a “combined” submodule that just ensures that its children (other submodules) are at mutually compatible versions. That helped productivity (multiple consumers of the “combined” submodule don’t need to manually track versions of all the children), but this discussion is pushing me towards the thought that a monorepo would actually be a more productive solution anyway, and would make more sense for cross-cutting changes.
Sorry to throw another option into the ring, and one that might already have been discussed and discounted, but I thought it worth sharing:
1) Create a new llvm-project-mono repo
2) Use git subtree instead of git submodule to add all the directories to match the layout of llvm-project.
3) From now on, all commits go to the monorepo
4) monorepo commits can be projected to the individual project repos, and additionally a new commit on llvm-project can be made with the submodule version updates
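
As a rough illustration of step 2, the sketch below only builds the `git subtree add` invocations and prints them rather than running anything. The subproject list and the URL pattern are placeholders I made up, not part of the proposal, and it assumes the new repository already has an initial commit and that the subtrees are added without `--squash` so the existing history (and hashes) come along.

```python
# Hypothetical subproject list and URL pattern; neither comes from the proposal.
SUBPROJECTS = ["llvm", "clang", "lld"]
URL_TEMPLATE = "https://example.org/git/{name}.git"


def subtree_import_commands(subprojects, branch="master"):
    """Return one `git subtree add` argv per subproject directory (step 2)."""
    commands = []
    for name in subprojects:
        commands.append([
            "git", "subtree", "add",
            "--prefix=" + name,               # directory inside the mono repo
            URL_TEMPLATE.format(name=name),   # repository to import from
            branch,                           # ref whose history is imported
        ])
    return commands


# Dry run: print the commands instead of executing them.
for argv in subtree_import_commands(SUBPROJECTS):
    print(" ".join(argv))
```

Printing the plan first makes it easy to review the directory layout against llvm-project before touching any repository.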
- No change for existing downstream users unless they want to move to the mono view
- Easier developer experience for cross-cutting changes
- Git log by path would work identically on either view of the repository
- Hashes from before the creation of the mono repo would match in both views - the mono repo will have multiple roots but that’s not unusual with git subtree
- Step 4 from my list would need a script to keep things updated. A server-side hook would be best. The mapping is deterministic (every mono repo commit maps to one commit in each affected submodule and one “submodule update” commit in the umbrella llvm-project repo), so if the server responsible falls over, the updates might be delayed but can be caught up without losing anything
- Less ideologically pure in terms of trying to keep the modules independent
- Commit hashes will diverge between the two views from the creation of the mono repo, making comparisons / merges between clones of the different views more difficult
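
To make the step 4 mapping concrete, here is a minimal sketch of the projection logic only, with no git calls. It assumes each subproject is a top-level directory of the mono repo; the function names, file paths, and commit message are invented for the example. It also shows the commit-message concern from earlier: every projected commit carries the same message, whatever share of the change it received.

```python
from collections import defaultdict


def project_commit(message, changed_paths):
    """Split one mono repo commit into per-subproject commits.

    changed_paths are mono-repo-relative, e.g. "clang/lib/UseFoo.cpp".
    Returns {subproject: (message, [paths relative to that subproject])}.
    """
    per_project = defaultdict(list)
    for path in sorted(changed_paths):
        top, _, rest = path.partition("/")
        if rest:  # skip files sitting at the mono repo root
            per_project[top].append(rest)
    # The message is copied unchanged into every projected commit.
    return {name: (message, paths) for name, paths in per_project.items()}


def umbrella_update(projected):
    """Describe the single submodule-bump commit for the umbrella repo."""
    return "Update submodules: " + ", ".join(sorted(projected))


# Hypothetical cross-cutting commit touching two subprojects.
commit = project_commit(
    "Rename Foo to Bar across llvm and clang",
    ["llvm/include/Foo.h", "clang/lib/UseFoo.cpp", "llvm/lib/Foo.cpp"],
)
print(commit)
print(umbrella_update(commit))
```

Because the output depends only on the mono repo commit itself, a hook that was down for a while could replay the missed commits in order and arrive at the same result.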