[llvm-dev] LLVM Releases: Upstream vs. Downstream / Distros

Wed May 11 09:16:07 PDT 2016

This is a long email :-) I've made some comments inline, but I'll
summarize my thoughts here:

- I like to think that the major releases have been shipped on a
pretty reliable six-month schedule lately. So we have that going for
us :-)

- It seems hard to align our upstream schedule to various downstream
preferences. One way would be to release much more often, but I don't
know if that's really desirable.

- I would absolutely like to see more involvement in the upstream
release processes from downstream folks and distros.

- I think we should use the bug tracker to capture issues that affect
releases. It would be cool if a commit hook could update the bugzilla
entries that a commit refers to.

Cheers,
Hans

On Wed, May 11, 2016 at 7:08 AM, Renato Golin <renato.golin at linaro.org> wrote:
> Folks,
>
> There has been enough discussion about keeping development downstream
> and how painful that is. Even more so, I think we all agree, is having
> downstream releases while tracking upstream releases, trunk and other
> branches (ex. Android).
>
> I have proposed "en passant" a few times already, but now I'm going to
> do it to a wider audience:
>
> Shall we sync our upstream release with the bulk of other downstream
> ones as well as OS distributions?
>
>
> This work involves a *lot* of premises that are not encoded yet, so
> we'll need a lot of work from all of us. But from the recent problems
> with GCC abi_tag and the arduous job of downstream release managers to
> know which patches to pick, I think there has been a lot of wasted
> effort by everyone, and that generates stress, conflicts, etc.
>
> I'd also like to reinforce the basic idea of software engineering: to
> use the same pattern for the same problem. So, if we have one way to
> link sequences of patches and merge them upstream, we ought to use the
> same pattern (preferably the same scripts) downstream, too. Of course
> there will be differences, but we should treat them as the exception,
> not the rule.
>
> So, a list of things will need to be solved to get to a low waste
> release process:
>
>
>   1. Timing
>
> Many downstream release managers, as well as distro maintainers, have
> complained about the timing of our releases, and how unreliable they
> are, and how that makes it hard for them to plan their own branches,
> cherry-picks and merges. If we release too early, they miss out on
> important optimisations; if we release too late, they'll have to branch
> "just before" and risk having to back-port late fixes to their own
> modified trees.
>
> Products that rely on LLVM already have their own life cycles and we
> can't change that. Nor can we make all downstream products align to
> our quasi-chaotic release process. However, the importance of the
> upstream release for upstream developers is *a lot* lower than for the
> downstream maintainers, so unless the latter group puts their weight
> (and effort) in the upstream process, little is going to happen to
> help them.
>
> A few (random) ideas:
>
>  * Do an average on all product cycles, pick the least costly time to
> release. This would marginalise those beyond the first sigma and we'd
> make their lives much harder than those within one sigma.
>  * Do the same average on the projects that are willing to lend a
> serious hand to the upstream release process. This has the same
> problem, but it's based on actual effort. It does concentrate bias on
> the better funded projects, but it's also easier for low key projects
> to change their release schedules.
>  * Try to release more often. The current cost of a release is high,
> but if we managed to lower it (by having more people, more automation,
> shared efforts), then it could be feasible and much fairer than
> weighted averages.

My random thoughts:

At least for the major releases, I think we're doing pretty well on
timing in terms of predictability: since 3.6, we have released every
six months: first week of March and first week of September (+/- a few
days). Branching has been similarly predictable: mid-January and
mid-July.

If there are many downstream releases for which shifting this schedule
would be useful, I suppose we could do that, but it seems unlikely
that there would be agreement on this, and changing the schedule is
disruptive for those who depend on it.

The only reasonable way I see of aligning upstream releases with
downstream schedules would be to release much more often. This works
well in Chromium where there's a 6-week staged release schedule. This
would mean there's always a branch going for the next release, and
important bug fixes would get merged to that. In Chromium we drive
this from the bug tracker -- it would be very hard to scan each commit
for things to cherry-pick. This kind of process has a high cost
though: there has to be good infrastructure for it (buildbots on the
branch for all targets, for example), developers have to be aware, and
even then it's a lot of work for those doing the releases. I'm not
sure we'd want to take this on. I'm also not sure it would be suitable
for a compiler, where we want the releases to have a long lifetime.

>   2. Process
>
> Our release process is *very* lean, and that's what makes it
> quasi-chaotic. In the beginning, not many people / companies wanted to
> help or cared about the releases, so the process was whatever the
> person doing the release did. The major release process is now better
> defined, but the same happened to the minor releases.
>
> For example, we have no defined date to start, or to end.

For the major releases, I've tried to do this. We could certainly
formalize it by posting it on the web page though.

> We have no
> assigned people to do the official releases, or test the supported
> targets. We still rely on voluntary work from all parties. That's ok
> when the release is just "a point in time", but if downstream releases
> and OS distributions start relying on our releases, we really should
> get a bit more professional.

Most importantly, those folks should get involved :-)

>
> A few (random) ideas:
>
>  * We should have predictable release times, both for starting it and
> finishing it. There will be complications, but we should treat them as
> the exception, not the rule.

SGTM, we pretty much already have this for major releases.

>  * We should have appointed members of the community that would be
> responsible for those releases, in the same way we have code owners
> (volunteers, but no less responsible), so that we can guarantee a
> consistent validation across all relevant targets. This goes beyond
> x86/ARM/MIPS/PPC and includes the other targets like AMD, NVidia, BPF,
> etc.

In practice, we kind of have this for at least some of the targets.
Maybe we should write this down somewhere instead of me asking for
(the same) volunteers each time the release process starts?

>  * The upstream release should be, as much as possible, independent of
> which OS they run on. OS specific releases should be done in the
> distributions themselves, and people interested should raise the
> concern in their own communities.
>  * Downstream managers should be an integral part of the upstream
> release process. Whenever the release manager sends the email, they
> should test on their end and reply with GREEN/RED flags.
>  * Downstream managers should also propose back-ports that are
> important to them in the upstream release. It's likely that a fix is
> important to a number of downstream releases but not many people
> upstream (since we're all using trunk). So, unless they tell us, we
> won't know.
>  * OS distribution managers should test on their builds, too. I know
> FreeBSD and Mandriva build by default with Clang. I know that Debian
> has an experimental build. I know that RedHat and Ubuntu have LLVM
> packages that they do care about. All that has to be tested *at least* every
> major release, but hopefully on all releases. (those who already do
> that, thank you!)
>  * A number of upstream groups, or downstream releases that don't
> track upstream releases, should *also* test them on their own
> workloads. Doing so will bring the upstream release to a much better
> quality level, and in turn, allow those projects to use it on their
> own internal releases.
>  * Every *new* bug found in any of those downstream tests should be
> reported in Bugzilla with the appropriate category (critical / major /
> minor). All major bugs have to be closed for the release to be out,
> etc. (the specific process will have to be agreed and documented).
>
>
>   3. Automation
>
> As explained in the timing and process sections, automation is key to
> reducing costs for all parties. We should collate the encoded process
> we have upstream with the process projects have downstream, and
> move upstream everything that we can and that is relevant.
>
> For example, finding which patches revert / fix another one that was
> already cherry-picked is a common task that should be identical for
> everyone. A script that would sweep the commit logs, looking for
> clues, would be useful to everyone.
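
A minimal sketch of what such a sweep could look like, not an existing
script. It assumes trunk commits are identified by SVN-style revisions
("r100010") and that a revert or follow-up fix names the revision it
touches in its commit message. The function name find_follow_ups and
the sample log are made up for illustration.

```python
import re

# Assumption: follow-up commits mention the revision they touch as "rNNNN".
REV_RE = re.compile(r"\br(\d{4,})\b")
# Assumption: a revert announces itself at the very start of the message.
REVERT_RE = re.compile(r"^\s*revert", re.IGNORECASE)


def find_follow_ups(commits, picked):
    """Return (follow_up_rev, picked_rev, kind) for every later commit whose
    message mentions a revision that was already cherry-picked.

    commits: iterable of (revision_number, message) pairs from the trunk log.
    picked:  set of revision numbers already merged to the release branch.
    """
    hits = []
    for rev, message in commits:
        kind = "revert" if REVERT_RE.search(message) else "mention"
        for ref in sorted({int(m) for m in REV_RE.findall(message)}):
            if ref in picked and ref < rev:
                hits.append((rev, ref, kind))
    return hits


if __name__ == "__main__":
    # Made-up log entries, for illustration only.
    log = [
        (100010, "Fix crash in the foo pass"),
        (100020, "Revert r100010: broke the bots"),
        (100030, "Follow-up to r100010, handle the empty case"),
        (100040, "Unrelated cleanup"),
    ]
    for follow_up, original, kind in find_follow_ups(log, picked={100010}):
        print(f"r{follow_up}: {kind} of r{original}")
```

A plain "mention" is only a clue, so a human would still have to decide
whether the follow-up needs to be merged as well.
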
>
> A few (random) ideas:
>
>  * We should discuss the process, express the consensus on the
> official documentation, and encode it in a set of scripts. It's a lot
> easier to tell a developer "please do X because it helps our script
> back-port your patch" than to say "please do X because it's nice" or
> "do X because it's in the 'guideline'".
>  * There's no way to force (via git-hook) developers to add a bugzilla
> ID or a review number on the commit message (not all commits are
> equal), so the scripts that scan commits will have to be smart enough,
> but that'll create false-positives, and they can't commit without
> human intervention. Showing why a commit wasn't picked up by the
> script, or was erroneously picked up, is a good way to educate people.
>  * We could have a somewhat-common interface with downstream releases,
> so some scripts that they use could be upstreamed, if many of them
> used the same entry point for testing, validating, building,
> packaging.
>  * We could have the scripts that distros use for building their own
> packages in our tree, so they could maintain them locally and we'd
> know which changes are happening and it would be much easier to warn the
> others, common up the interface, etc.
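
The commit-scanning idea above could start out as something like the
sketch below. It is only an illustration: it assumes bugs are written
as "PR12345" and reviews as "D12345" (for example in a "Differential
Revision:" line), and the function name classify is hypothetical. It
returns a reason along with the verdict, so the output can be shown to
the committer when a commit is skipped or picked up by mistake.

```python
import re

# Assumptions: bug IDs look like "PR12345", review numbers like "D12345".
BUG_RE = re.compile(r"\bPR(\d+)\b")
REVIEW_RE = re.compile(r"\bD(\d+)\b")


def classify(message):
    """Return (picked_up, reason) for one commit message."""
    bugs = sorted(set(BUG_RE.findall(message)), key=int)
    reviews = sorted(set(REVIEW_RE.findall(message)), key=int)
    if not bugs and not reviews:
        return False, "no bug ID or review number found"
    parts = ["PR" + b for b in bugs] + ["D" + r for r in reviews]
    return True, "references " + ", ".join(parts)


if __name__ == "__main__":
    # Made-up commit messages, for illustration only.
    samples = [
        "Fix PR27000 in the inliner\n\n"
        "Differential Revision: http://reviews.llvm.org/D19000",
        "Tidy up comments",
    ]
    for msg in samples:
        picked_up, reason = classify(msg)
        subject = msg.splitlines()[0]
        print(f"{'PICK' if picked_up else 'SKIP'}: {subject} ({reason})")
```

Patterns this loose will produce false positives, so the result is a
candidate list for a human to review, not something to merge
automatically.
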
>
>
> In the end, we're a bunch of people from *very* different communities
> doing similar work. In the spirit of open source, I'd like to propose
> that we share the work and the responsibility of producing high
> quality software with minimal waste.
>
> I don't think anyone receiving this email disagrees with the statement
> that we can't just take and not give back, and that being part of this
> community means we may have to work harder than our employers would
> think brings direct profit, so that they can profit even more
> indirectly later, and with that, everyone that uses or depends on our
> software.
>
> My personal and very humble opinion is that coalescing the release
> process will, in the long term, actually *save* us a lot of work, and
> the quality will be increased. Even if we don't reach perfection, and
> by no means do I think we will, at least we'll have something slightly
> better. If anything, at least we tried.
>
> I'd hate to continue doing an inefficient process without even trying
> an alternative...
>
> Comments?
>
> cheers,
> --renato