[flang-dev] Testing infrastructure for F18

Tue Nov 12 10:12:36 PST 2019

Adding in my two cents... I think this temporary testing infrastructure through Drone CI is a great idea and we should absolutely do it, especially since it seems David has done basically all the legwork to get it functional already.  Testing is vitally important, and the sooner we start regular CI testing the better.  

However, if people object to the temporary nature of the solution, that is all the more reason to make moving the Flang codebase into the upstream monorepo a higher collective priority. We have to do that at some point anyway, and we might as well do it sooner rather than later.

Best,
Alexis Perry-Holby
Los Alamos National Laboratory
Applied Computer Science (CCS-7)

On 11/12/19, 10:17 AM, "flang-dev on behalf of Doerfert, Johannes via flang-dev" <flang-dev-bounces at lists.llvm.org on behalf of flang-dev at lists.llvm.org> wrote:

    If a temporary solution turns out to be what we need, that's better than
    nothing. I am interested in whether we could set up the permanent
    solution instead ;)

    On 11/12, David Truby wrote:
    > I'm just proposing this as a temporary solution until we are inside
    > the monorepo and using LLVM's existing infrastructure in general. We
    > just wanted a quick and easy way to get CI up and running without
    > provisioning our own machines.

    Without provisioning our "own" machines, the only option would be to ask
    the LLVM Foundation to host the buildbots. I was hoping that some parties
    would eventually run their own buildbots, though. If running machines is
    the limiting factor right now, I guess we need to build a temporary
    solution first and a different one afterwards.

    > I couldn't see a way of using the LLVM buildbot infrastructure as an
    > external project (i.e. a project not inside the LLVM repo) but I may
    > have missed something, please let me know if I have!

    Given that Flang (=F18) is, for all intents and purposes, an LLVM
    project, we should be able/allowed to submit test results to the LLVM
    buildserver. That still requires us to set up buildbot instances
    somewhere, either on LLVM Foundation servers or on our own. The latter
    is actually not too hard, though. We have configuration files in place
    and these instances can be completely inaccessible to the public. "All"
    they need to be able to do is pull git repos, build code, run tests, and
    submit results via an outgoing HTTP POST request (afaik).

    > On Nov 11 2019, at 6:56 pm, Doerfert, Johannes <jdoerfert at anl.gov>
    > wrote:
    >
    > First, thanks for working on this!
    > 
    > Did you consider setting up buildbot instances, as the public-facing
    > LLVM testing infrastructure does? I don't think they need to be
    > externally accessible, since all we care about is the reports (with
    > log files) that they send to http://lab.llvm.org:8011/ .
    > 
    > Cheers, Johannes
    > 
    > 
    > On 11/11, David Truby via flang-dev wrote:
    >
    > Hi all,
    >
    > I have been investigating the possibility of adding testing
    > infrastructure to F18 on GitHub. Below I cover the rationale and
    > results of my investigation, and I would appreciate thoughts/feedback
    > on whether this is something we want to move forward with.
    > 
    > # Rationale
    > 
    > F18 currently has no testing infrastructure upstream. This has led to
    > a few issues with certain additions breaking builds on certain
    > compilers. Since it is unrealistic to expect developers to test with
    > every compiler we claim to support, some testing infrastructure should
    > be in place to catch these issues. Such a measure would be a stop-gap
    > until F18 is fully integrated into the upstream LLVM infrastructure,
    > which has its own testing.
    > 
    > # Investigated Services
    > 
    > A number of possible options were investigated for use as a CI service
    > for F18, but investigation focused on the following services:
    >
    > * Travis CI
    > * Shippable
    > * Jenkins
    > * Drone CI
    > 
    > Since F18 is an upstream open source project, the preference was to
    > attempt to use an externally accessible CI service rather than running
    > a service internally on Arm servers; this discounted Jenkins as it
    > requires self-hosting. Initially Travis seemed promising as it is a
    > commonly used service that is free for open source projects. However,
    > Travis only gives access to test nodes with 2 CPU cores, 8GB of RAM
    > and a 30 minute time limit: this is not sufficient for building F18.
    > 
    > Shippable had similar issues due to having the same constraints on
    > resources; however, Shippable does allow a "bring your own node" mode
    > where custom machines can be added to the CI for testing, similar to
    > LLVM’s buildbot. This would require provisioning nodes for CI though,
    > and since this is a temporary measure it would be best to avoid that
    > if possible.
    > 
    > Drone CI is a newer service, and is less widely used and therefore
    > less well documented than the other services investigated. However,
    > the Drone cloud service is free for open source software and gives
    > access to entire bare metal AMD and Arm nodes provided by packet.com.
    > This allows F18 to be built in a more reasonable time, making it
    > feasible to use this for F18’s CI.
    > 
    > # Drone CI
    > 
    > For the reasons listed above, Drone CI was selected for further
    > investigation. It turned out to be fairly easy to add multiple build
    > configurations to the Drone .star file and to get pre-commit testing
    > working with run times of around 10 minutes. Currently this has been
    > tested with clang-9/libc++ and gcc-9 on both amd64 and arm64. Other
    > compilers and configurations can be added easily, so we should discuss
    > which ones we actually want to test.
    > 
    > # User-facing consequences
    > 
    > I have implemented support for this CI on my own branch of f18 to
    > check what the user-facing consequences look like. CI results are
    > reported through the GitHub UI in two places. First, when a PR is
    > submitted, a notification appears on the PR review page saying that CI
    > is running; it is then updated to state whether the tests passed or
    > failed. A link to the run on cloud.drone.io is given so that you can
    > see which platform/compiler failed and what the failure output was.
    > Second, after a PR is merged, CI runs on the merged commit and gives a
    > notification if it fails.
    > 
    > Currently these pre- and post-commit runs run the same number of
    > tests, as none of the f18 tests take very long to run. In future it
    > would be possible to separate the tests into short and long tests, as
    > LLVM does, and run the long tests only after merging. However, since
    > this is a stop-gap measure until we move to LLVM’s testing
    > infrastructure, this might never be necessary.
    > 
    > It is possible to also set up emails to this mailing list (or a
    > separate mailing list) with post-commit test failures if this is
    > something that people would find useful.
    > 
    > # Next steps
    > 
    > If this is something we want to go ahead with, the next step is for me
    > to submit a PR with the required CI configuration file. Once this is
    > merged, someone with commit access will need to log in to
    > cloud.drone.io with their GitHub account and activate CI for F18. This
    > should be as simple as pressing a single activate button next to the
    > repository name, although it may be necessary to change a couple of
    > settings as well. After the CI has been activated on Drone, the
    > results should appear automatically in the GitHub UI as mentioned
    > above.
    > 
    > Please let me know what you think.
    >
    > David Truby
    > 
    > _______________________________________________
    > flang-dev mailing list
    > flang-dev at lists.llvm.org
    > https://lists.llvm.org/cgi-bin/mailman/listinfo/flang-dev
    > 
    > 
    > --
    > 
    > Johannes Doerfert
    > Researcher
    >
    > Argonne National Laboratory
    > Lemont, IL 60439, USA
    > 
    > jdoerfert at anl.gov
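
As a rough illustration of the "multiple build configurations" mentioned in the Drone CI section of the quoted proposal, the sketch below generates one pipeline description per compiler/architecture pair. It is written in plain Python rather than as an actual Drone .star file, and it is not the configuration from David's branch: the image name, the environment variables, the build commands, and the `check-all` target are all assumptions made for the example. Only the two toolchains and two architectures come from the message.

```python
# Hypothetical sketch of a CI build matrix. Image name, commands, and the
# test target are illustrative assumptions, not the proposal's real config.
from itertools import product

# Toolchains and architectures named in the proposal.
TOOLCHAINS = {
    "clang-9-libcxx": {"cxx": "clang++-9", "cxxflags": "-stdlib=libc++"},
    "gcc-9": {"cxx": "g++-9", "cxxflags": ""},
}
ARCHES = ["amd64", "arm64"]


def make_pipeline(toolchain, arch):
    """Return one pipeline description as a plain dict."""
    tc = TOOLCHAINS[toolchain]
    return {
        "kind": "pipeline",
        "type": "docker",
        "name": f"{toolchain}-{arch}",
        "platform": {"os": "linux", "arch": arch},
        "steps": [
            {
                "name": "build-and-test",
                "image": "example/f18-builder",  # hypothetical image name
                "environment": {"CXX": tc["cxx"], "CXXFLAGS": tc["cxxflags"]},
                "commands": [  # placeholder commands, assumed for the sketch
                    "cmake -S . -B build",
                    "cmake --build build",
                    "cmake --build build --target check-all",
                ],
            }
        ],
    }


def main(ctx=None):
    """Build the full matrix: every toolchain on every architecture."""
    return [make_pipeline(t, a) for t, a in product(TOOLCHAINS, ARCHES)]


if __name__ == "__main__":
    for pipeline in main():
        print(pipeline["name"])
```

Adding another compiler would then be a one-line change to `TOOLCHAINS`, which is one way to read the remark that other configurations "can be added easily".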

    -- 

    Johannes Doerfert
    Researcher

    Argonne National Laboratory
    Lemont, IL 60439, USA

    jdoerfert at anl.gov
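
To make Johannes's list of what a privately hosted build machine needs to do a little more concrete (pull git repos, build code, run tests, submit results over an outgoing HTTP POST), here is a minimal sketch of one such cycle. It takes the "afaik" description in the message at face value and does not model how a real buildbot worker is configured or how it talks to the LLVM build server. The repository URL, the results endpoint, the payload fields, and the function names are all hypothetical. The sketch only prepares the commands and the request; nothing is executed or sent.

```python
# Hypothetical sketch: every step is outbound, so the machine itself never
# needs to accept incoming connections. URLs and payload are assumptions.
import json
import urllib.request


def plan_cycle(repo_url, workdir):
    """Return the commands for one pull/build/test cycle as argv lists."""
    build_dir = f"{workdir}/build"
    return [
        ["git", "clone", "--depth", "1", repo_url, workdir],  # pull
        ["cmake", "-S", workdir, "-B", build_dir],            # configure
        ["cmake", "--build", build_dir],                      # build
        ["ctest", "--test-dir", build_dir],                   # run tests
    ]


def build_report(results_url, builder, passed, log):
    """Prepare (but do not send) an outgoing POST carrying the results."""
    payload = json.dumps({"builder": builder, "passed": passed, "log": log})
    return urllib.request.Request(
        results_url,
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    for argv in plan_cycle("https://example.invalid/f18.git", "/tmp/f18"):
        print(" ".join(argv))
    req = build_report(
        "https://ci.example.invalid/results", "f18-gcc-9", True, "all passed"
    )
    print(req.get_method(), req.full_url)
```

In a real setup the commands would be run with something like `subprocess.run` and the request sent with `urllib.request.urlopen`; both are left out here so that the example has no side effects.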