Mehdi Amini via llvm-dev llvm-dev at lists.llvm.org
Tue Jul 19 15:47:08 PDT 2016

> On Jul 19, 2016, at 3:32 PM, Renato Golin <renato.golin at linaro.org> wrote:
> On 19 July 2016 at 23:16, Mehdi Amini <mehdi.amini at apple.com> wrote:
>>> In the past, we were hit by web spiders that completely ignored the
>>> robots.txt file. Anton has made that better, but it can escalate if
>>> the spiders realise we blocked them. There are ways to work around it, but
>>> not without accidentally blocking innocent people (mostly in China).
>> That’s not relevant: this is about the WWW server, which does not have to be related to hosting the repos.
> No, this is about hosting the SVN server. The SVN view was disabled
> for months this year before we could really see what was going on.

I don’t believe the online SVN viewer has to be on the same server that hosts the repo everyone accesses: the WWW server could mirror the SVN repo to give the viewer local access if needed (which is why I view this as unrelated to hosting source code).

>> Moving the SVN repo does not solve hosting videos, Debian packages, etc.
>> I suspect most of the bandwidth does not come from `svn up` or `git pull`.
> They share the same bandwidth, and sometimes the same server. It is relevant.

Well, “they share the same bandwidth” is exactly what I mean by “conflating the issues”.
They don’t *have to* share the same bandwidth. Hosting the repos could be set up completely separately from hosting the WWW server.
You need to account for things properly.

> One thing making SVN slow was the number of Debian packages being
> downloaded from the same place.
>> Like… proper hooks?
> If we can work around it, and it seems we can, this is not such a big issue.
>> You’re again conflating svn/git and hosting “binaries and videos”. I don’t think we ever planned to host these on GitHub?
> No, but they all share bandwidth. We moved videos to Youtube to
> offload the bandwidth, and moving the code to GitHub shares the same
> mindset.

It shares the same mindset *only* if the code itself is a significant bandwidth consumer; otherwise, it does not make sense.

>> Possibly, I don’t know, but that’s exactly why I asked for first-hand data (i.e. from Anton and/or Tanya) about hosting the git/SVN repos themselves, instead of hand-wavy “I believe” discussions.
> Bear in mind that I gave you facts (bandwidth problems, turned off SVN
> services, constant breakdowns, expertise in handling traffic, backup
> solutions).

And I consider that many of the “facts” you gave conflate elements other than hosting the repository *alone*, which makes it hard for me to see them as relevant as-is.

> I also made you aware that the human cost is not *just* Tanya and
> Anton, but also me and everyone else that maintains buildbots,
> external mirrors, etc. and it *is* larger than the hardware costs. You
> just don't see it because we're all volunteers.
> Branding them as "hand-wavy I believe" is *not* appropriate.

I apologize if I hurt your feelings, but the reality is that I feel you’re conflating multiple things that are not directly related to “moving the repository only”, and that does not help make the case convincing. My use of “hand-wavy”, if that’s what bothered you, means exactly that and nothing more: I’m not attaching any other value judgement to the expression. As a non-native speaker, I may not have chosen the right word.


