[flang-dev] Testing infrastructure for F18
Doerfert, Johannes via flang-dev
flang-dev at lists.llvm.org
Tue Nov 12 19:36:13 PST 2019
On 11/12, Perry-Holby, Alexis wrote:
> Adding in my two cents... I think this temporary testing
> infrastructure through Drone CI is a great idea and we should
> absolutely do it, especially since it seems David has done basically
> all the legwork to get it functional already. Testing is vitally
> important, and the sooner we start regular CI testing the better.
Fully agreed on the importance, and on the "sooner rather than later" part.
Especially if that is all set up, I don't want to stop anyone.
> However, if people are opposed to the temporary nature of the
> solution, then this is all the more reason to move getting the Flang
> codebase into the upstream monorepo up our collective priority list.
> We have to do that at some point anyway, and we might as well do it
> sooner rather than later.
I was under the impression there is no final solution yet but "only"
some investigation into different options (that people were familiar
with). As I mentioned in my last email (the inlined parts), we can
probably submit test results to lab.llvm.org without being in-tree,
so we can work on these things in parallel.
One option to get buildbot instances up and running without using our own
machines would be to "rent" them from AWS, Google Cloud, or a similar provider.
FWIW, buildbot configuration files and other interesting things can be found
here: https://github.com/llvm/llvm-zorg
The OpenMP (host) test bot configuration looks like this:
https://github.com/llvm/llvm-zorg/blob/master/zorg/buildbot/builders/OpenMPBuilder.py
Afaik, it requires little more than one of those, a running buildbot
client, and an email to the buildmaster administrator to get things
running.
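
To make the "little more than one of those" point concrete, here is a rough
sketch in plain Python (not the actual buildbot or llvm-zorg API) of the
ordered steps such a builder would have to encode for an out-of-tree F18
build. The names Step, f18_steps and run_steps, the directory names, and the
CMake/CTest invocations are all assumptions for illustration; a real
configuration would need more CMake flags, and reporting to the buildmaster
is handled by buildbot itself and is not modeled here.

```python
import subprocess
from typing import Callable, List, NamedTuple, Tuple


class Step(NamedTuple):
    """One command a build machine runs, and the directory it runs in."""
    name: str
    command: List[str]
    workdir: str


def f18_steps(repo_url: str, src: str = "f18", build: str = "f18-build") -> List[Step]:
    """Return the ordered steps for one build: fetch, configure, build, test.

    Hypothetical layout: sources in `src`, out-of-tree build in `build`.
    Assumes the tests are registered with CTest.
    """
    return [
        Step("checkout", ["git", "clone", "--depth", "1", repo_url, src], "."),
        Step("configure", ["cmake", "-S", src, "-B", build,
                           "-DCMAKE_BUILD_TYPE=Release"], "."),
        Step("build", ["cmake", "--build", build], "."),
        Step("test", ["ctest", "--output-on-failure"], build),
    ]


def run_steps(steps: List[Step],
              runner: Callable = subprocess.run) -> Tuple[bool, List[Tuple[str, int]]]:
    """Run steps in order and stop at the first failure.

    Returns (ok, log), where log holds (step name, return code) pairs.
    """
    log = []
    for step in steps:
        result = runner(step.command, cwd=step.workdir)
        log.append((step.name, result.returncode))
        if result.returncode != 0:
            return False, log
    return True, log
```

Calling run_steps(f18_steps(url)) on a rented machine would then be the whole
job, with url being wherever the F18 sources live. The runner parameter only
exists so the sequence can be exercised without actually invoking git or cmake.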
> On 11/12/19, 10:17 AM, "flang-dev on behalf of Doerfert, Johannes via flang-dev" <flang-dev-bounces at lists.llvm.org on behalf of flang-dev at lists.llvm.org> wrote:
>
> If a temporary solution turns out to be what we need, that's better than
> nothing. I am interested if we could set up the permanent solution
> instead ;)
>
> On 11/12, David Truby wrote:
> > I'm just proposing this as a temporary solution until we are inside
> > the monorepo and using LLVM's existing infrastructure in general. We
> > just wanted a quick and easy way to get CI up and running without
> > provisioning our own machines.
>
> Without provisioning "own" machines we could only ask the
> LLVM-Foundation to host the buildbots. I was hoping/thinking that some
> parties will eventually run their own buildbots though. If running
> machines is the limiting restriction right now, I guess we need to first
> build a temporary solution before we build a different one afterwards.
>
> > I couldn't see a way of using the LLVM buildbot infrastructure as an
> > external project (i.e. a project not inside the LLVM repo) but I may
> > have missed something, please let me know if I have!
>
> Given that Flang (=F18) is, for all intents and purposes, an LLVM
> project, we should be able/allowed to submit test results to the LLVM
> buildserver. That still requires us to set up buildbot instances
> somewhere, on LLVM Foundation servers or our own. The latter is
> actually not too hard, though. We have configuration files in place and these
> instances can be completely inaccessible from the public. "All" they
> need to be able to do is: pull git repos, build code and run tests, and
> submit results via outgoing http POST request (afaik).
>
>
>
> > On Nov 11 2019, at 6:56 pm, Doerfert, Johannes <jdoerfert at anl.gov>
> > wrote:
> >
> > First, thanks for working on this!
> >
> > Did you consider setting up buildbot instances, as the public facing
> > LLVM testing infrastructure does? I don't think they need to be
> > externally accessible but all we care about is the reports (with log
> > files) that they send to http://lab.llvm.org:8011/ .
> >
> > Cheers, Johannes
> >
> >
> > On 11/11, David Truby via flang-dev wrote:
> >
> > Hi all,
> >
> > I have been investigating the possibility of adding testing
> > infrastructure to F18 on GitHub. I have covered here the rationale and
> > results of my investigation, and would appreciate some
> > thoughts/feedback on whether this is something we want to move
> > forward with.
> >
> > # Rationale
> >
> > F18 currently has no testing infrastructure upstream. This has led to
> > a few issues with certain additions breaking builds on certain
> > compilers. Since it is unrealistic to expect developers to test with
> > every compiler we claim to support, some testing infrastructure should
> > be in place to catch these issues. Such a measure would be a stop-gap
> > until F18 is fully integrated into the upstream LLVM infrastructure,
> > which has its own testing.
> >
> > # Investigated Services
> >
> > A number of possible options were investigated for use as a CI service
> > for F18, but investigation focused on the following services:
> >
> > * Travis CI
> > * Shippable
> > * Jenkins
> > * Drone CI
> >
> > Since F18 is an upstream open source project, the preference was to
> > attempt to use an externally accessible CI service rather than running
> > a service internally on Arm servers; this discounted Jenkins as it
> > requires self-hosting. Initially Travis seemed promising as it is a
> > commonly used service that is free for open source projects. However,
> > Travis only gives access to test nodes with 2 CPU cores, 8GB of RAM
> > and a 30 minute time limit: this is not sufficient for building F18.
> >
> > Shippable had similar issues due to having the same constraints on
> > resources. However, Shippable does allow a "bring your own node" mode
> > where custom machines can be added to the CI for testing, similar to
> > LLVM’s buildbot. This would require provisioning nodes for CI though,
> > and since this is a temporary measure it would be best to avoid that
> > if possible.
> >
> > Drone CI is a newer service, and is less widely used and therefore
> > less well documented than the other services investigated. However,
> > the Drone cloud service is free for open source software and gives
> > access to entire bare metal AMD and Arm nodes provided by packet.com.
> > This allows F18 to be built in a more reasonable time, making it
> > feasible to use this for F18’s CI.
> >
> > # Drone CI
> >
> > For the reasons listed above, Drone CI was selected for further
> > investigation. It turned out to be fairly easy to add multiple build
> > configurations to the drone .star file, and get pre-commit working
> > with run times of around 10 minutes. Currently this has been tested
> > with clang-9/libc++ and gcc-9 on both amd64 and arm64. Other compilers
> > and configurations can be added easily, so a discussion should be had
> > as to what we want to actually test for.
> >
> > # User-facing consequences
> >
> > I have implemented support for this CI on my own branch of f18 to
> > check what the user-facing consequences of this look like. CI results
> > are reported through the github UI, in two separate places: firstly,
> > when a PR is submitted, on the PR review page a notification will
> > appear that CI is running, and will be updated to state whether the
> > tests passed or failed. A link is given to the run on cloud.drone.io
> > so that you can see which platform/compiler failed, and what the
> > failure output was. Additionally, after merging a PR, CI will run on
> > the merged commit, and give a notification if it fails.
> >
> > Currently these pre- and post-commit runs run the same number of
> > tests, as none of the f18 tests take very long to run. In future it
> > would be possible to separate the tests into short and long tests, as
> > LLVM does, and run the long tests only after merging. However, since
> > this is a stop-gap measure until we move to LLVM’s testing
> > infrastructure this might never be necessary.
> >
> > It is possible to also set up emails to this mailing list (or a
> > separate mailing list) with post-commit test failures if this is
> > something that people would find useful.
> >
> > # Next steps
> >
> > If this is something we want to go ahead with, the next step is for me
> > to submit a PR with the required CI configuration file. Once this is
> > merged, someone with commit access will need to log in with their
> > github account to cloud.drone.io and activate CI for F18; this should
> > be as simple as pressing a single activate button next to the
> > repository name, however it may be necessary to change a couple of
> > settings as well. After the CI has been activated on drone, the
> > results should appear automatically in the github UI as mentioned
> > above.
> >
> > Please let me know what you think.
> >
> > David Truby
--
Johannes Doerfert
Researcher
Argonne National Laboratory
Lemont, IL 60439, USA
jdoerfert at anl.gov