[llvm-dev] [RFC] One or many git repositories?

Mehdi Amini via llvm-dev llvm-dev at lists.llvm.org
Sun Jul 24 10:46:30 PDT 2016


> On Jul 22, 2016, at 12:51 AM, Chandler Carruth via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> 
> I wanted to present some of the particular reasons why I'm pretty strongly opposed to a purely flat layout of projects the way the current github 'llvm-project' repository looks, as that hasn't happened on the list yet. I'm replying to myself as I don't see a much better place to hang that conversation.
> 
> On Wed, Jul 20, 2016 at 7:38 PM Chandler Carruth <chandlerc at google.com> wrote:
> On Wed, Jul 20, 2016 at 7:08 PM James Y Knight via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> Should the layout in the merged repository be:
> 1) Like the "llvm-project" git repository is now:
> 
> <root>/llvm/
> <root>/clang/
> <root>/compiler-rt
> ...
> 
> 2) Like the "ideal merged checkout" is now:
> llvm/
> llvm/tools/clang
> llvm/projects/compiler-rt
> ...
> 
> 
> I don't much care which of those is chosen. I have a slight preference for #1, for ease of doing things like grep/log/etc on llvm by itself, excluding all the other projects. But either way seems probably fine, and an improvement over multiple repositories.
> 
> FWIW, I strongly prefer #2, but I think the high order bit is the repository question.
> 
> So, a reasonable question might be, why do I prefer #2?
> I have a lot of not terribly connected reasons.
> 
> First, I want to consider what happens if we go with #1. Today, LLVM subprojects have been formed essentially any time it was conducive to do so. This worked around the subversion sparse checkout challenges (arguably also solved by newer subversion features, but that's neither here nor there) and didn't cause any problem because we could lay out the tree any way that made sense and we always had a global revision number. A classic example: clang-tools-extra. At the time it was added, it was perceived as very useful to segregate. These days, I'm not sure the risk is interesting any more, and the cost is probably higher than the benefit. But it probably doesn't make sense to have a "cfe" directory and "clang-tools-extra" directory as peers. If we're moving to a monolithic repo, the clang-tools-extra stuff should almost *certainly* move under the 'tools' directory in the clang repo, wherever that ends up.
> 
> So, if we go with #1 above and just use the existing subversion repos as the top level directories, how would we rationally make a decision in the future about "should X new directory be a top-level directory, or just fit into the existing hierarchy?". I don't think we will ever have a good and principled response. We will constantly have oddball warts where things happen to be top-level because at one point we wanted the ability to not check out those Subversion repos, and now that has been enshrined.
> 
> 
> I'm not actually arguing that #2 is a *good* layout. But I think it is a (slightly) less arbitrary layout than #1.

I’m not sold on this: the existing build system layout could be seen as even more arbitrary than the SVN repo separations.


> And by breaking this weird mold of "all Subversion projects are top level", I think we'll be in a better place to make reasonable and considered decisions about re-structuring the layout long-term to reflect a useful and rational layout based on some set of reasonable technical principles.
> 
> It also has the advantage of being the layout which, if people's existing scripts and systems are set up around the defaults in the CMake build, will be the simplest to migrate to. I certainly know that all of my habits and patterns are geared around this layout and it will be dramatically easier for me to migrate to a single repo if it preserves this layout.

Again, I’m not convinced: right now the default is that “cmake path/to/llvm” builds *only* LLVM. The flat layout naturally preserves this behavior, while unifying everything under the build-system layout makes it difficult to get the existing behavior.


> Long term, I want to see us use a layout that reasonably connotes the logical and practical structure of the code and project as a whole. I also long-term want to see the layout effectively address the pragmatic needs of tools and systems developers rely on such as "git log". On the whole, I think #2 is (slightly) closer to that than #1 so I strongly prefer it, but it clearly isn't perfect here. I just think we can incrementally fix and improve the layout over time. I don't think we're stuck in a single layout forever.

With #2, `git log` is suddenly harder, i.e., instead of getting only llvm, I get libcxx + libcxxabi + compiler-rt + clang + clang-tools-extra + lld + …
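
To illustrate the point, here is a minimal sketch in Python. The commit ids, file names, and the `log_for_path` helper are all hypothetical and only model a path-limited log by prefix matching; this is not real repository history or git's actual implementation.

```python
# Hypothetical commits: (id, owning project, path relative to that project).
COMMITS = [
    ("c1", "llvm", "lib/a.cpp"),
    ("c2", "clang", "lib/b.cpp"),
    ("c3", "compiler-rt", "lib/c.cpp"),
]

# Where each project's tree lives under the two layouts discussed in this thread.
FLAT = {  # layout #1
    "llvm": "llvm/",
    "clang": "clang/",
    "compiler-rt": "compiler-rt/",
}
NESTED = {  # layout #2
    "llvm": "llvm/",
    "clang": "llvm/tools/clang/",
    "compiler-rt": "llvm/projects/compiler-rt/",
}


def log_for_path(layout, prefix):
    """Return ids of commits that touch a file under `prefix` in the given layout."""
    return [
        commit_id
        for commit_id, project, rel_path in COMMITS
        if (layout[project] + rel_path).startswith(prefix)
    ]


print(log_for_path(FLAT, "llvm/"))    # only the llvm commit
print(log_for_path(NESTED, "llvm/"))  # every project nested under llvm/
```

Under these assumptions, limiting the log to `llvm/` yields one commit in the flat layout and all three in the nested one.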

Also it is not clear to me why the layout can’t be fixed gradually over time (`git mv`) starting from the flat one.

Since neither the existing build system nor any of the tooling is currently able to handle #2 (but could be fine with #1), I’m not convinced about “starting with #2 and then improving the situation” because the disruption seems too important this way.

-- 
Mehdi

> 
> Hope this helps motivate why I would very much prefer to retain the default layout suggested in our docs and build system for now, and phrase any re-organization as follow-on changes once we had a single repo that made such changes straightforward and easily history-preserving.
> 
> -Chandler