[llvm-dev] GitHub Survey?

Thu Oct 13 12:45:38 PDT 2016

> On Oct 13, 2016, at 11:56 AM, Renato Golin <renato.golin at linaro.org> wrote:
> 
> Hi Duncan,
> 
> I don't understand your concerns.
> 
> First, the choice between sub-modules and mono-repo has been put
> forward as the only two choices because people felt that, if we let it
> open, we'd have too many different implementation details and we'd
> never get anywhere.
> 
> So...
> 
>> - how much pain the transition would cause, instead of what they think the right final state is.
> 
> The final state is defined by submod vs. monorepo, and that's
> represented in a different question. Those questions are addressing
> the additional work done to get there, as many have said would be the
> crucial decision point.
> 
> It also outlines the cost over their preferred vs non-preferred
> solutions, which leads to the aggregated cost over the whole project
> for each decision.
> 
> 
>> - what's good for the individuals responding, instead of what they think is best for the LLVM project; and
> 
> That's implied. I think it is clear enough, but we can always change
> the wording if others feel confused.
> 
> 
>> Secondly, I'm worried about this question: "How does the choice between a single repository with all projects and the use of sub-modules impact your usage of Git?"  I'm not sure we'll get good signal from this; it's essentially a vote on the two variants, but it doesn't force the respondent to think about the specific issues.  I'd rather find a way to ask about the specific concerns raised in the document.
> 
> It is a vote. The "thinking" is on the extended answer that follows.
> Answers with good extended reasoning will have a greater weight than
> those without.
> 
> If you're worried about data mining, then leaving those questions to
> full text answers will require someone to read it all, interpret, and
> put their bias on top. Given the nature of this problem, we should
> avoid bias whenever possible, especially when interpreting the
> answers.
> 
> 
>> Thirdly, I'm worried that the follow-ups talk about "preferred" and "non-preferred" instead of "multirepo" and "monorepo".  This makes data-mining non-trivial (because the meaning depends on previous answers) and increases the chance of respondent confusion.
> 
> I see your point. We can re-word to make that more clear.
> 
>> 4. How often do you work on a small LLVM sub-project without using a checkout of LLVM itself?
>> - Always.
>> - Most of the time.
>> - Sometimes.
>> - Never.
> 
> Interesting, it covers the main problem with both proposals.
> 
> 
>> 5. Please categorize how you interact with upstream.
>> - I need read/write access, and I have limited disk space.
>> - I need read/write access, but a 1GB clone doesn't scare me.
>> - I only need read access.
> 
> I'm not sure that's critical. My current source repo has 35GB with
> just a few worktrees.
> 
> Also, both solutions have low-disk-usage modes, and this would make no
> difference on how we proceed.

This is a point of contention and a concern that Chris voiced about the monorepo. It should be in the survey.

> 
> 
> 
>> 6. How important is cross-project blame, grep, etc.?
>> - Vital.  I already use SVN/monorepo/custom-tooling to accomplish this.
>> - Extremely.  It should be easy enough that everyone does it by default.
>> - Somewhat.  I would use it if it were easy, but it's just nice to have.
>> - Not at all.  Anyone who cares can write their own tooling.
> 
> Based on other comments in the thread, we should leave this one out.

The point of the survey is to gather data. The fact that not many people are doing it does not mean that, after reading the proposal document, they wouldn’t answer “It should be easy enough that everyone does it by default.”

> 
> 
>> 7. Single-commit cross-project refactoring designs away a class of build failures and simplifies making API changes.  How important is it?
>> - Vital.  I already use SVN/monorepo/custom-tooling to accomplish this.
>> - Extremely.  It should be easy enough that everyone does it by default.
>> - Somewhat.  I would use it if it were easy, but it's just nice to have.
>> - Not at all.  Anyone who cares can write their own tooling.
> 
> I don't like to assert my opinion and then ask how much people agree.

I don’t see an “opinion” in the question.

> I prefer to ask the question directly, like:
> 
> How often do you need to commit across repositories (ex. llvm+clang)
> and how often are your builds broken because they're in separate
> repositories?

Asking it this way does not allow someone to answer “It should be easy enough that everyone does it by default.”

I prefer Duncan’s wording.

> 
> Also, I think your scale of importance is somewhat skewed up. Vital and
> Extremely are at the top, somewhat is right bang in the middle and not
> at all is the very bottom.
> 
> You either have two positive and two negative (very, somewhat, not
> much, not at all) or you add a fifth in the middle. I prefer 4 because
> that makes people think harder.
> 
> 
>> 8. The multirepo variant provides a read-only umbrella repository to coordinate commits between the split sub-project repositories using Git submodules.  Assuming multirepo gets adopted, how do you expect to use the umbrella?
>> // checkboxes:
>> + Actively contribute tooling improvements to improve it.
>> + Integrate it into our downstream fork.
>> + Use it for upstream contributions.
>> + Use it as the primary interface development environment.
>> + Use it for bisection.
> 
> Good. (+ N/A, too)
> 
> 
>> 9. If multirepo is adopted, how do you plan to contribute to upstream?
>> - Using Git submodules.
>> - Using the Git repos directly.
>> - Using the SVN bridges.
>> - n/a: I don't contribute.
>> 
>> 10. The monorepo variant provides read/write access to sub-projects via an SVN bridge and git-svn.  Contributors will have the option to continue using repositories split on project boundaries.  Assuming monorepo gets adopted, how do you plan to contribute?
>> - I'll use the monorepo as soon as it's possible, even before it's canonical.
>> - I'll use the monorepo as soon as it's canonical.
>> - I'll transition to monorepo eventually.
>> - I'll use the SVN bridge on separated sub-projects forever.
>> - I'll use a Git mirror (and/or git-svn) on separated sub-projects forever.
>> - n/a: I don't contribute.
>> 
>> 11. If monorepo is adopted, how do you plan to integrate it downstream?
>> - We already use monorepo.
>> - We'll switch to pulling from monorepo during the transition period.
>> - We'll switch to pulling from monorepo eventually.
>> - We'll integrate from the SVN bridge forever.
>> - We'll integrate from the split sub-project Git mirror forever.
>> - n/a: There is no downstream.
> 
> Good.
> 
> 
>> 12. The multi/mono hybrid variant merges some sub-projects, but leaves runtimes in separate repositories using the umbrella to tie them together.  Is this the best or worst of both worlds?
>> - This is great.  Native cross-project refactoring, without penalizing runtime-only developers.
>> - Whatever.  I'll deal with it.
>> - This is terrible.  All the transition pain of monorepo, without the advantages.
> 
> I didn't know we were proposing yet another variant. This seems like a
> last minute rushed in proposal and I don't want to endorse it in the
> survey. We can discuss them in the BoF, though.

We're not “endorsing” anything in the survey. We’re collecting data to help drive the BoF discussion of the proposal document.

Before starting the survey design, I stated that we should first have the proposal document ready, and that the survey should ask the relevant questions with respect to the proposal.

Also, this “variant” was discussed very early when the monorepo proposal came out.

> 
> 
>> 13. If multirepo is adopted, how much pain will there be in your transition?
>> - Nothing consequential.
>> - A little; but it'll be fine.
>> - A lot; but it'll get done somehow.
>> - Too much; I/we may stop contributing to LLVM.
>> 
>> 14. If monorepo is adopted, how much pain will there be in your transition?
>> - Nothing consequential.
>> - A little; but it'll be fine.
>> - A lot; but it'll get done somehow.
>> - Too much; I/we may stop contributing to LLVM.
> 
> Those are already covered by the current bad/good, but I'll change the
> wording to be like this one.
> 
> 
>> 15. If we could go back in time and restart the project with today's technologies, which repository scheme would be best for the LLVM project?
>> - CVS.
>> - Subversion repository with split sub-projects (<sub-project>/trunk), with git-svn.
>> - Subversion repository as a single project (trunk/<sub-project>), with git-svn.
>> - Git: multirepo variant.
>> - Git: monorepo variant.
>> - Git: multi/mono hybrid variant.
>> - Other.
> 
> Let's not put CVS in there, please. :)
> 
> So, what's the purpose of this question? I mean, we are "starting
> fresh" in a way, and the responses of the rest of the survey would
> make this question irrelevant, no?

We’re not “starting fresh”: we have downstream users integrating the repos, a bug tracker referencing revisions, and tooling (LNT, llvm-bisect …).
The point of the question is: setting aside the pain of the transition, what is the “ideal” environment for developing LLVM?

— 
Mehdi

> 
> I'll be changing the wording on the ones we all agree on and leave the
> ones with questions until they're all solved.
> 
> cheers,
> --renato