[llvm-foundation] LLVM Infrastructure

Renato Golin via llvm-foundation llvm-foundation at lists.llvm.org
Wed May 18 07:26:53 PDT 2016


Folks,

I'd like to understand what the Foundation's plans are for infrastructure.

According to the announcement [1], the foundation is responsible for
overseeing all LLVM tools, websites and general infrastructure.

Although some work has already been done, I feel most of it was to
remediate the worst immediate problems. I believe some investigation
is necessary, and that its results should be shared in the right
forum. This list might not be it, but I'd rather risk being wrong here
than start another giant thread on the main lists.

Basically, I'm proposing a three-step process for all our infrastructure:

1. Investigate the problems and solutions we have, potentially asking
on the main lists.
2. Propose a set of changes (publicly or privately), from cheapest to
most expensive, and make the cut on what we can afford.
3. Draw a plan, and when publicly visible changes are due, make sure
the proposal and schedule are fair, and do it.

For example, moving the web server from one cloud provider to another
makes no visible difference, so it doesn't need to be a public
process. Changing our repository provider, however, may affect a lot
of internal processes, and people will react badly if nothing is
shared with them first, so that kind of change needs exposure
beforehand.

The areas I think we must improve:


  A. Code repositories

The SVN server is fairly unstable, with outages that affect all buildbots.

There was a migration earlier this year. I don't know whether the repo
was involved, but we still had an outage a few weeks ago. I also don't
know how the cost of hosting our own "stable enough" repository
compares with paying an SVN hosting company to do that for us.

The benefit of using code hosting companies is that they have a
larger, distributed and more stable infrastructure specifically
tailored to code hosting, which is something that would be very
expensive for us to do. But we may get away with slightly more than
what we have today and not pay too much.

In my view, this requires a deeper investigation: a list of prices
versus features across various providers, so that we can make an
informed proposal. It'll also require community involvement, as this
changes the core of what we do.

The same goes for the Git repo, which is a lot better than SVN, but
where we could reduce costs by moving to a FOSS-friendly host.


  B. Buildbots

Our current build master is *very* slow. Buildbot itself is slow, I
know, but with the number of people we have looking at those bots, it
can take anywhere from several seconds to a minute to get a page back.
We may need a dedicated server (cloud instance) for this, scaled as
necessary.
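
To put numbers on "slow" before proposing anything, the page load times
could be sampled with something like the sketch below. It is only an
illustration: the helper names are made up, and the fetch function is
passed in so the timing logic doesn't depend on any particular URL.

```python
import statistics
import time
from typing import Callable, Dict, List


def time_fetches(fetch: Callable[[], object], samples: int = 5) -> List[float]:
    """Call fetch() `samples` times; return each wall-clock duration in seconds."""
    durations = []
    for _ in range(samples):
        start = time.perf_counter()
        fetch()
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations: List[float]) -> Dict[str, float]:
    """Reduce raw timings to the few numbers worth putting in a report."""
    return {
        "samples": len(durations),
        "median_s": statistics.median(durations),
        "worst_s": max(durations),
    }
```

For a real measurement, fetch could be a lambda wrapping
urllib.request.urlopen(url, timeout=120).read(), with url pointing at
whichever build master page people complain about most.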

Also, the current master is on version 0.8.5, which besides being
ancient, has several drawbacks:
 * It doesn't support newer SQLAlchemy, and it has trouble with newer
buildslaves that use it.
 * It doesn't support submitting patches to a particular build, making
pre-commit buildbots impossible.
 * It has worse support for Windows builders than the newer versions.

On top of that, there is a huge list of changes around SVN/Git
actions, authentication, RSS feeds, stability and interaction with
other systems (Gerrit, GitHub, etc.).

But a migration to a newer build master would probably require a
large-scale Zorg refactoring.

I don't know the costs of the migration, nor do I know the costs of
the current solution. We need a clear picture, and maybe to propose a
few ways out of stagnation:
 - Start a new master elsewhere, slowly move the bots towards it
 - Refresh the master in compatibility mode, slowly move slaves, switch
 - Deprecate buildbots and move everyone to Jenkins?

Whatever works for everyone.


  C. Bugzilla

Bugzilla is great, I'm one of the weirdos who actually likes it. But
our Bugzilla is also severely outdated, and its internal organisation
is disconnected from how the project has evolved over the years.

I think we should update our Bugzilla purely in the interest of bug
fixes and security issues, as I don't know of anything that we may
want from a newer version. Some other people might...

Also, if we're going to be writing scripts to automate creating bugs
and scanning through Bugzilla web services, the server may end up more
heavily loaded than it is now, and I don't think its performance is
great as it is.
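
To give an idea of what such a script could look like, here is a
minimal sketch that only builds a search URL. It assumes a Bugzilla new
enough to offer the REST API (5.0 or later), which our install may well
not be. The host name and the helper name are made up for illustration.

```python
from urllib.parse import urlencode

# Hypothetical host; substitute the real Bugzilla base URL.
BASE_URL = "https://bugzilla.example.org"


def build_search_url(product, status="NEW", limit=50, base_url=BASE_URL):
    """Build a REST search URL for bugs in one product with a given status."""
    query = urlencode({
        "product": product,
        "status": status,
        # Ask only for the fields we need, to keep responses small.
        "include_fields": "id,summary,status",
        "limit": limit,
    })
    return f"{base_url}/rest/bug?{query}"
```

Every call like this is one more request the server has to answer, so
a script looping over products and statuses adds up quickly, which is
why the load question matters.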

Since this is mostly a web service anyway, upgrading the version
should have very little impact on the community, so it's more about
the cost of a new server (cloud instance) and the migration itself
than anything else.


  D. Phabricator

Phab is a good tool, but I believe we have a copy that has been
modified to suit our needs and has thus diverged from whatever is
upstream. I think it's ok to modify our tools, but very little effort
has been put into finishing the modifications, for example,
understanding inline email replies. This is not a trivial task, but
far too many people have had this problem in the main Phab, and I
remember reading that, due to our changes, it may not be easy to
upgrade.

I personally think that local modifications without an effort to
upstream them go against the goals of FOSS communities, and we
shouldn't be promoting that ourselves. All in all, I think this
deserves at least a report on how good/bad it is, and how we should
improve the bad things (i.e. lack of updates).


  E. Others

I believe that the email and web pages infrastructure is now good
enough and fit for purpose, but it also needs to be part of the
overall plan (below). If I'm not mistaken, this is the server that was
just upgraded, so that shows some progress, which is very welcome.


  Z. Update Plan

In the end, I think we need to think about how much work to put into
updating the infrastructure, from moving to new servers, to updating
software, to switching to new solutions. Since this is something that
can disrupt the community at large (whether we update or not), it
should also be shared with the larger community, so that we have a
clear picture of what to expect and what to request if serious
problems arise.

Until now, we've been very relaxed about adopting tools, like when
Chandler introduced Phabricator. I think that's a perfectly valid way
of introducing new concepts and tools, but not so much of maintaining
them. The more we depend on the tools we add, the more care we need to
put into keeping them available, fast, up-to-date and secure.

With all that in mind, my question is: what is the Foundation's plan
for maintaining and improving LLVM's core infrastructure, and how can
the rest of the community help in defining and implementing those
plans?

cheers,
--renato

[1] http://blog.llvm.org/2014/04/the-llvm-foundation.html

