[flang-dev] Testing infrastructure for F18

Tue Nov 12 07:53:51 PST 2019

Hi Johannes,

I'm just proposing this as a temporary solution until we are inside the monorepo and using LLVM's existing infrastructure in general. We just wanted a quick and easy way to get CI up and running without provisioning our own machines.
I couldn't see a way of using the LLVM buildbot infrastructure as an external project (i.e. a project not inside the LLVM repo), but I may have missed something. Please let me know if I have!

David Truby

On Nov 11 2019, at 6:56 pm, Doerfert, Johannes <jdoerfert at anl.gov> wrote:
First, thanks for working on this!

Did you consider setting up buildbot instances, as the public facing
LLVM testing infrastructure does? I don't think they need to be
externally accessible but all we care about is the reports (with log
files) that they send to http://lab.llvm.org:8011/ .

Cheers,
Johannes

On 11/11, David Truby via flang-dev wrote:
Hi all,
I have been investigating the possibility of adding testing infrastructure to F18 on GitHub. I have covered the rationale and results of my investigation here, and would appreciate some thoughts/feedback on whether this is something we want to move forward with.

# Rationale

F18 currently has no testing infrastructure upstream. This has led to a few issues with certain additions breaking builds on certain compilers. Since it is unrealistic to expect developers to test with every compiler we claim to support, some testing infrastructure should be in place to catch these issues. Such a measure would be a stop-gap until F18 is fully integrated into the upstream LLVM infrastructure, which has its own testing.

# Investigated Services

A number of possible options were investigated for use as a CI service
for F18, but investigation focused on the following services:
* Travis CI
* Shippable
* Jenkins
* Drone CI

Since F18 is an upstream open source project, the preference was to attempt to use an externally accessible CI service rather than running a service internally on Arm servers; this discounted Jenkins as it requires self-hosting. Initially Travis seemed promising as it is a commonly used service that is free for open source projects. However, Travis only gives access to test nodes with 2 CPU cores, 8GB of RAM and a 30 minute time limit: this is not sufficient for building F18.

Shippable had similar issues, as it has the same constraints on resources. However, Shippable does allow a "bring your own node" mode where custom machines can be added to the CI for testing, similar to LLVM’s buildbot. This would require provisioning nodes for CI though, and since this is a temporary measure it would be best to avoid that if possible.

Drone CI is a newer service, and is less widely used and therefore less well documented than the other services investigated. However, the Drone cloud service is free for open source software and gives access to entire bare metal AMD and Arm nodes provided by packet.com. This allows F18 to be built in a more reasonable time, making it feasible to use this for F18’s CI.

# Drone CI

For the reasons listed above, Drone CI was selected for further investigation. It turned out to be fairly easy to add multiple build configurations to the Drone .star file and to get pre-commit testing working with run times of around 10 minutes. Currently this has been tested with clang-9/libc++ and gcc-9 on both amd64 and arm64. Other compilers and configurations can be added easily, so a discussion should be had as to what we actually want to test for.
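
To illustrate what "multiple build configurations" in a Drone .star file might look like, here is a minimal sketch. It is not the configuration from my branch. Drone's Starlark configs define a `main(ctx)` function that returns a list of pipeline dictionaries, and since Starlark syntax is a subset of Python the sketch below also runs as plain Python. The image names, CMake flags and build commands are hypothetical placeholders, not the real F18 build recipe.

```python
# Sketch of a Drone Starlark config (.drone.star).
# Assumptions: the Docker image names and the build/test commands below are
# hypothetical placeholders, not the actual F18 CI setup.

ARCHES = ["amd64", "arm64"]

COMPILERS = [
    {
        "name": "clang-9",
        "image": "example/f18-ci:clang-9",  # hypothetical image
        "cmake_flags": "-DCMAKE_CXX_COMPILER=clang++-9 -DCMAKE_CXX_FLAGS=-stdlib=libc++",
    },
    {
        "name": "gcc-9",
        "image": "example/f18-ci:gcc-9",  # hypothetical image
        "cmake_flags": "-DCMAKE_CXX_COMPILER=g++-9",
    },
]


def pipeline(arch, compiler):
    # One Drone pipeline per (architecture, compiler) pair.
    return {
        "kind": "pipeline",
        "type": "docker",
        "name": "%s-%s" % (arch, compiler["name"]),
        "platform": {"os": "linux", "arch": arch},
        "steps": [
            {
                "name": "build-and-test",
                "image": compiler["image"],
                "commands": [
                    # Placeholder commands; the real build needs more setup.
                    "mkdir build && cd build",
                    "cmake %s .." % compiler["cmake_flags"],
                    "make -j$(nproc)",
                    "ctest",
                ],
            },
        ],
    }


def main(ctx):
    # Drone calls main(ctx) and expects a list of pipeline definitions.
    return [pipeline(arch, c) for arch in ARCHES for c in COMPILERS]
```

With this layout, adding another compiler or architecture is a one-entry change to `COMPILERS` or `ARCHES`, which is why the choice of what to test is mostly a policy question rather than a technical one.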

# User-facing consequences

I have implemented support for this CI on my own branch of f18 to check what the user-facing consequences look like. CI results are reported through the GitHub UI in two places. First, when a PR is submitted, a notification appears on the PR review page saying that CI is running, and it is updated to state whether the tests passed or failed. A link is given to the run on cloud.drone.io so that you can see which platform/compiler failed and what the failure output was. Second, after a PR is merged, CI runs on the merged commit and gives a notification if it fails.

Currently these pre- and post-commit runs run the same set of tests, as none of the f18 tests take very long to run. In future it would be possible to separate the tests into short and long tests, as LLVM does, and run the long tests only after merging. However, since this is a stop-gap measure until we move to LLVM’s testing infrastructure, this might never be necessary.

It is possible to also set up emails to this mailing list (or a separate mailing list) with post-commit test failures if this is something that people would find useful.

# Next steps

If this is something we want to go ahead with, the next step is for me to submit a PR with the required CI configuration file. Once this is merged, someone with commit access will need to log in to cloud.drone.io with their GitHub account and activate CI for F18. This should be as simple as pressing a single activate button next to the repository name, although it may be necessary to change a couple of settings as well. After CI has been activated on Drone, the results should appear automatically in the GitHub UI as mentioned above.

Please let me know what you think.
David Truby

_______________________________________________
flang-dev mailing list
flang-dev at lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/flang-dev

--

Johannes Doerfert
Researcher

Argonne National Laboratory
Lemont, IL 60439, USA

jdoerfert at anl.gov