[llvm-dev] [RFC] Compiled regression tests.

David Blaikie via llvm-dev llvm-dev at lists.llvm.org
Wed Jun 24 08:12:04 PDT 2020


On Wed, Jun 24, 2020 at 8:04 AM Michael Kruse <llvmdev at meinersbur.de> wrote:
>
> Am Mi., 24. Juni 2020 um 00:37 Uhr schrieb David Blaikie <dblaikie at gmail.com>:
> > I'm pretty change averse - and am in this case (but that doesn't mean other folks won't be in favor, and doesn't mean it isn't worth trying, etc. - but if it were up to me, at the moment, I'd decline)
>
> That's understandable. New features also come with a cost that they
> need to recoup through their usefulness.
>
>
> > To take one point from the phab review (to try to keep the discussion here rather than there - might be worth a note on the phab review to discourage commenting there so the discussion doesn't get spread through different threads):
>
> I added a remark to the Phabricator thread.
> (I kept these arguments there to keep the RFC itself short)
>
>
> > Because of all of the above, maintenance of all the regression tests is a nightmare. I expect it to be a serious issue for introducing opaque pointers. My prediction is that we will end up with a typed-pointer command line flag just to avoid having to update all the write-only regression tests.
> >
> > Actually I already did a lot of work with the initial patches years ago for opaque pointers (that added explicit types to gep/store/etc) and used (& provided in the commit messages) Python scripts to migrate all the tests, both the IR itself and the CHECK text. This is probably more readily automatable than the more free-form C++-based checking proposed here.
> >
> > That said, it sounds like the proposal is a lot closer to the GUnit tests, and if this testing strategy is valuable, it seems like it could be mostly achieved by adding some API utilities (like the one you proposed in the patch) to make it more convenient to run optimization passes in a GUnit test. It doesn't seem to me like an #ifdef-based approach to embedding IR in C++ would result in particularly more manageable/formattable code than a raw string. Perhaps the proposed improvements could be used to reduce/remove the cost of adding new GUnit tests and the need to touch CMake files, etc. I'd worry, though, about the divergence of where optimization tests are written - as unit tests or as lit/FileCheck tests. That doesn't mean experimentation isn't worthwhile, but I think it'd require a pretty compelling justification to propose a replacement for the FileCheck approach (and perhaps a timeline for an experiment before either removing it or deciding that it is the intended future state). And if it's not a replacement, then I think we'd need to discuss which situations this new thing is suitable for, and what FileCheck should be used for going forward.
>
> As mentioned in the Differential, generating the tests automatically
> will lose information about what actually is intended to be tested,

Agreed - and I didn't mean to suggest tests should be automatically
generated. I work pretty hard in code reviews to encourage tests to be
written as stably as possible so they are resilient to unrelated
changes. The Python scripts I wrote to update tests didn't require
tests to be written in an automated fashion, but were still able to
migrate the significant majority of hand-written FileCheck test cases.
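For context, a FileCheck-style regression test keeps the input IR and the expected output together in one .ll file, which is what makes both sides mechanically rewritable. A minimal sketch (the function, pass, and checked pattern here are illustrative, not taken from any real test):

```llvm
; RUN: opt -passes=instcombine -S %s | FileCheck %s

; instcombine should fold away the redundant add of zero.
define i32 @test(i32 %a) {
; CHECK-LABEL: @test(
; CHECK-NEXT:    ret i32 %a
  %b = add i32 %a, 0
  ret i32 %b
}
```

Scripts like the ones described above can parse both the IR body and the CHECK lines of such a file and rewrite them in lockstep.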

> making it harder to understand the test.
> If we had a method to just update all tests (Polly has `ninja
> polly-update-format` to automatically fix `ninja
> polly-check-format` failures), updating would be easier, but because
> of that lack of understanding, in practice most changes would just
> be glossed over.
>
> We already have a split between unittests (e.g.
> llvm/Transforms/Scalar/LICMTests.cpp) and regression tests
> (llvm/test/Transforms/LICM/*.ll). New tests in D82426 are located next
> to the .ll files. The unittests could be moved there too.

Yep, if there's a nice way to change lit/cmake so that unit tests can
live alongside the FileCheck tests, I think that might be an
improvement. For now, though, I'd only expect to find a unit test for
something when it's not reasonably FileCheck-testable. In this case it
looks like LICMTest.cpp is testing side effects on analyses when
running LICM, which makes some sense: analyses have no textual
representation, so they can't be readily FileCheck-tested; only their
side effects can.
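As a rough illustration of that kind of test - this is a hedged sketch, assuming the usual googletest and new-pass-manager APIs, and is not the contents of the actual LICMTest.cpp - one can parse IR from a raw string, run LICM, and then assert on an analysis result that FileCheck could never see:

```cpp
// Sketch only: parse IR from a raw string, run LICM through the new
// pass manager, then query an analysis afterwards. All names below
// (test name, function @f) are made up for this example.
#include "gtest/gtest.h"
#include "llvm/AsmParser/Parser.h"
#include "llvm/IR/Dominators.h"
#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/Module.h"
#include "llvm/Passes/PassBuilder.h"
#include "llvm/Support/SourceMgr.h"
#include "llvm/Transforms/Scalar/LICM.h"
#include "llvm/Transforms/Scalar/LoopPassManager.h"

using namespace llvm;

TEST(LICMSketch, AnalysisSideEffect) {
  LLVMContext Ctx;
  SMDiagnostic Err;
  // Illustrative input IR; any function with a loop would do.
  std::unique_ptr<Module> M = parseAssemblyString(R"IR(
    define void @f(i32 %n) {
    entry:
      br label %loop
    loop:
      %i = phi i32 [ 0, %entry ], [ %i.next, %loop ]
      %i.next = add i32 %i, 1
      %cmp = icmp slt i32 %i.next, %n
      br i1 %cmp, label %loop, label %exit
    exit:
      ret void
    }
  )IR",
                                                  Err, Ctx);
  ASSERT_TRUE(M);

  // Standard new-pass-manager boilerplate to register all analyses.
  PassBuilder PB;
  LoopAnalysisManager LAM;
  FunctionAnalysisManager FAM;
  CGSCCAnalysisManager CGAM;
  ModuleAnalysisManager MAM;
  PB.registerModuleAnalyses(MAM);
  PB.registerCGSCCAnalyses(CGAM);
  PB.registerFunctionAnalyses(FAM);
  PB.registerLoopAnalyses(LAM);
  PB.crossRegisterProxies(LAM, FAM, CGAM, MAM);

  Function &F = *M->getFunction("f");
  FunctionPassManager FPM;
  FPM.addPass(createFunctionToLoopPassAdaptor(LICMPass()));
  FPM.run(F, FAM);

  // Assert on an analysis state after the pass ran - a side effect
  // with no textual representation for FileCheck to match.
  auto &DT = FAM.getResult<DominatorTreeAnalysis>(F);
  EXPECT_NE(DT.getRootNode(), nullptr);
}
```

The point of the sketch is the last two lines: the assertion targets the analysis manager's state, not any printed IR.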

- Dave

