[llvm-dev] Subprojects, GitHub, and the Monorepo

Sam McCall via llvm-dev llvm-dev at lists.llvm.org
Sat Oct 20 10:38:52 PDT 2018


I work on clangd, the language server/IDE backend in clang/tools/extra.
Clangd is at a stage where the core functionality is stable and useful
enough that we want to put it in front of more users. I've been spending time
recently thinking about user-facing things: packaging, mailing lists, docs,
bugtracking.

And I think we should do much of this on GitHub, rather than *.llvm.org.
And not in the upcoming monorepo, but in a separate repository. (e.g.
github.com/llvm/clangd)

I expect this to be controversial. It's definitely community fragmentation.
I think the reasons to do it for clangd are strong, but they won't apply
equally to all projects. And I'd like to know what people think. So here's
my reasoning.

*Point 1: It's what people expect.*
Everyone knows how to use the GitHub bug tracker, and has a GitHub account.
Everyone knows markdown, how to edit-and-preview, and how to send a doc
pull request.
Everyone has these workflows in their muscle memory when a GitHub project
is the top web search result.
(Current LLVM developers *also* know the LLVM equivalents, but that's a
small group).
This is largely why we're moving the code to GitHub, too.

*Point 2: exposing the LLVM monolith is bad for users.*
Clangd's customers don't care about the structure of the LLVM umbrella
project, or even that it exists.
If they search for clangd on the web, they want to find a tree that looks
like this:
clangd
- features
- installation
- bugs
- code
Not like this:
llvm.org
- docs
-- lldb, etc
-- clang
--- features, etc
--- tools
---- clang-tidy, etc
---- clangd
----- features
----- installation
- bugs
-- lldb, etc
-- clang
--- tools
---- clang-tidy, etc
---- clangd
- code
-- lldb, etc
...
LLVM's source repository is monolithic for technical reasons (versioning),
but that's not a strong reason that the bug trackers, documentation, etc.
should be monolithic. Spraying hyperlinks around won't fix the fact that
the website is the wrong shape.

*Point 3: the tools are just better.*
I have nothing but respect and gratitude for the people who admin
Bugzilla, wrangle CMake and Sphinx to generate docs, and keep Mailman
running. But unsurprisingly the state of the art has moved on, and the
equivalents are in my experience easier to use, faster, and more reliable.
Symptoms of this are people routing around the tools: LLDB doesn't use
Sphinx for docs, sanitizers don't use Bugzilla.
I'm sure there's going to be some agreement and disagreement on this point
:-)

*Point 4: but the tools are designed for smaller, focused repositories*
The "github-native" community is mostly using fairly narrowly scoped
repositories, and the tools work better this way. For example, labels are
enough to organize issues in a project the size of clangd, but too
lightweight if the scope is LLVM and all subprojects.

*What does the logical conclusion of this look like?*
I don't know. I suspect other subprojects in a similar boat may
independently come to the same conclusion. Projects that have e.g. lots of
bug history will need a migration story.
None of this mitigates the need for a source monorepo, so we'd be stuck
with all the code in llvm/llvm and just issues/docs in llvm/clangd. Not
ideal, but manageable.
Clangd is a pretty easy case, so I don't know if this makes it a good trial
or a bad one.

<*dons flame-retardant suit*>
What do you all think?