[llvm-dev] Proposal: Make the VE target official

Wed Nov 10 06:08:25 PST 2021

On Mon, 2021-11-08 at 13:43 -0800, Philip Reames wrote:
> +1 to Renato's points.
> One extra point on the build bot is that your cycle time appears to
> be about 30 minutes.  That's not unreasonable, but faster cycles are
> always better (i.e. shorter blame lists).  Any chance you can reduce
> that time via e.g. more hardware or build config tweaks (such as
> ccache)?  I don't mean to suggest this as a blocking item, simply as
> an area where improvement is possible.
We should be able to bring that down. clang-ve-ninja currently builds
everything from scratch, and it's all static. I'd love to have working
shared component libraries for faster, incremental builds.

We are also considering a second, faster builder that only builds and
tests LLVM+Clang. That would be the canary for any issues with the VE
backend.

clang-ve-ninja would be the slow but thorough builder that includes all
supported runtimes and runs compiled code on the VE.
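
As a rough sketch of how the two builder configurations could differ
(the helper below and the runtime list are illustrative assumptions,
not our actual buildbot setup; the CMake options themselves are
standard LLVM ones):

```python
# Hypothetical sketch: CMake arguments for a fast "canary" builder and the
# thorough clang-ve-ninja builder. The helper name and the runtime list are
# assumptions for illustration, not the real buildbot configuration.

def cmake_args(projects, runtimes=(), use_ccache=False, shared_libs=False):
    """Return a list of CMake arguments for an LLVM build with the VE target."""
    args = [
        "cmake", "-G", "Ninja", "../llvm",
        "-DCMAKE_BUILD_TYPE=Release",
        # VE is still an experimental target, so it has to be enabled explicitly.
        "-DLLVM_EXPERIMENTAL_TARGETS_TO_BUILD=VE",
        "-DLLVM_ENABLE_PROJECTS=" + ";".join(projects),
    ]
    if runtimes:
        args.append("-DLLVM_ENABLE_RUNTIMES=" + ";".join(runtimes))
    if use_ccache:
        # Reuse object files across builds to shorten the cycle time.
        args.append("-DLLVM_CCACHE_BUILD=ON")
    if shared_libs:
        # Shared component libraries, once they work for this setup.
        args.append("-DBUILD_SHARED_LIBS=ON")
    return args

# Fast builder: LLVM+Clang only, with ccache.
fast = cmake_args(["clang"], use_ccache=True)
# Thorough builder: additionally builds a runtime (compiler-rt as an example).
thorough = cmake_args(["clang"], runtimes=["compiler-rt"])

print(" ".join(fast))
```

The fast configuration leaves out the runtimes entirely, which is where
most of the extra build and test time would go.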

> Philip
> On 11/8/21 7:52 AM, Renato Golin via llvm-dev wrote:
>  
> > On Mon, 8 Nov 2021 at 14:56, Simon Moll <Simon.Moll at emea.nec.com>
> > wrote:
> >  
> > > If you look at the build logs of clang-ve-ninja, you will see
> > > that the "check all" tests for LLVM+Clang have been passing for
> > > a while. What's failing is compiler-rt and we have a patch for
> > > that.
> > > 
> > 
> > Right, what we mean by "green bots" is that there should be no
> > conditions for the bot to be considered a success. 
> > 
> > Buildbots *must* not only test all known functionality that is
> > expected to work, but they also must not be "red".
> > 
> > This is something that perhaps isn't clear in the new target
> > section of the documents, but it has been the modus operandi for a
> > long time.
> > 
> > If the bot is red, or turns red easily, it can't be relied upon to
> > convey success in target testing, because you can't expect non-NEC
> > developers to know what's good and what's not, or what should pass
> > and what shouldn't.
> > 
> > It's the responsibility of the bot owner (and ultimately, the
> > target's community), to make sure the bots accurately reflect the
> > quality of the target.
> > 
> > Therefore, a (perhaps undocumented) item on the checklist before
> > moving out of experimental is: the bots must test the target and
> > they must be green and stable (weeks without crashing for spurious
> > reasons).
> > 
> > In VE's case, looking at the earlier builds and seeing that "clang
> > check" passes them all, should be enough to assert history, but
> > before the target is built by everyone else, the bot must be green.
Thanks for shedding some light on the more implicit items on the
checklist.

Once D113093 is in, clang-ve-ninja is expected to be green.
We can call that the stable state - everything that's tested is
supposed to work and any red-ness implies breakage.

> > 
> > 
> >  
> > > Yes, the compiler-rt tests are failing for well understood
> > > reasons (documented in the patch - check-all on LLVM+Clang is
> > > green). The patch will make compiler-rt pass on VE by accounting
> > > for those (no denorm support, syscall differences).
> > > We explicitly include compiler-rt testing (even though it is
> > > failing) to have LLVM-compiled code running on the VE in CI. This
> > > is not something we'd do for slick optics.
> > > 
> > 
> > Right, I've done the same thing when turning on the Arm back-end. I
> > built enough buildbots that showed that the target was working at
> > a basic level, then disabled the compiler-rt and test-suite tests
> > that were not passing, with specific bugzilla items for each one,
> > and then with time, I fixed all of them and then all Arm bots were
> > green.
> > 
> > In your case, no other bot (should? will?) build compiler-rt for
> > VE, so this shouldn't hit other bots, which will start building VE
> > once it builds by default.
> > 
> > But your buildbot will still be the *only* bot that builds VE proper
> > and uses hardware, so it will be the representative of the VE
> > target.
> > 
> > If it stays red, and later on problems start to appear in
> > the LIT tests, then other developers will look at your bot, red for
> > ages, and will likely infer that no one cares, and disable the
> > broken test.
This may be a good moment to mention that the compiler-rt patch
disables tests that will never work on VE. There is no floating-point
denormal support, for example.

> > 
> > Overall, it's much easier if the main bot is green and all the
> > disabled tests have bugzilla entries showing that you are working
> > on it.
Using bugzilla for this makes sense - evidently for bugs but also to
track/document features that aren't ready yet. Besides that, I'd like
to have a CI approach for turning on features (in particular runtimes,
which tend to be less incremental than the backend work). I am thinking
of the following:

With the compiler-rt patch, clang-ve-ninja will be green. The coverage
of that bot defines what's officially supported for VE at any given
moment.

We add a new staging buildbot that builds everything clang-ve-ninja
does plus the yet-unsupported features that we are currently working
on. Initially, that bot will be 'red' while the official one has to be
kept 'green'. Once we are confident about the feature/runtime (both
bots are 'green'), we will make the official bot test that feature,
thereby declaring the feature official. The staging bot will then turn
to new experimental features.
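
To make the promotion rule concrete, here is a minimal sketch of it as
a small model. The `Bot` class, the `promote` helper, the staging bot
name "ve-staging" and the "libcxx" feature are all hypothetical and only
serve as an illustration; none of this is actual buildbot code.

```python
from dataclasses import dataclass, field


@dataclass
class Bot:
    """A toy model of a buildbot: its name, what it tests, and its status."""
    name: str
    features: set = field(default_factory=set)
    green: bool = False


def promote(official, staging, feature):
    """Add `feature` to the official bot's coverage if both bots are green.

    Returns True if the feature was promoted, False otherwise.
    """
    if feature not in staging.features:
        raise ValueError(f"{feature} is not tested by {staging.name}")
    if not (official.green and staging.green):
        return False
    official.features.add(feature)
    return True


# The official bot is green and defines what is supported.
official = Bot("clang-ve-ninja", {"llvm", "clang", "compiler-rt"}, green=True)
# The staging bot tests the same set plus one feature still in progress.
staging = Bot("ve-staging", official.features | {"libcxx"}, green=False)

first = promote(official, staging, "libcxx")   # staging is still red
staging.green = True                           # the feature has stabilized
second = promote(official, staging, "libcxx")  # both bots are green now
```

The first call leaves the official coverage untouched because the
staging bot is red; only the second call makes the feature official.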

Coming back to Philip's point about slow turnaround, one advantage of
the "ambitious twin" of clang-ve-ninja is that we could experiment with
speeding up the build without affecting the official bot.

By the way, we are planning to port more or less all LLVM runtimes to
VE. The delta between clang-ve-ninja and the staging twin will mostly
be runtimes.

> > 
> >  
> > > The GitHub repo is for reference only. If you look at our
> > > upstream patch history, you will see that we submit small patches
> > > with tests and follow the review protocol.
> > > 
> > 
> > I know, that's not what I meant.
> > 
> > My point is that it's really hard to use that branch for reference
> > because of all of the other non-VE stuff that is there too, bundled
> > in a single merge commit. 
> > 
> > cheers,
> > --renato

Thanks
- Simon
