[cfe-dev] [llvm-dev] Testing Best Practices/Goals (in the context of compiler-rt)

Fri Feb 26 14:15:07 PST 2016

On Fri, Feb 26, 2016 at 2:10 PM, Alexey Samsonov <vonosmas at gmail.com> wrote:

>
> On Fri, Feb 26, 2016 at 1:34 PM, David Blaikie <dblaikie at gmail.com> wrote:
>
>>
>>
>> On Fri, Feb 26, 2016 at 1:31 PM, Sean Silva <chisophugis at gmail.com>
>> wrote:
>>
>>>
>>>
>>> On Fri, Feb 26, 2016 at 1:11 PM, David Blaikie <dblaikie at gmail.com>
>>> wrote:
>>>
>>>>
>>>>
>>>> On Fri, Feb 26, 2016 at 1:07 PM, Sean Silva <chisophugis at gmail.com>
>>>> wrote:
>>>>
>>>>>
>>>>>
>>>>> On Wed, Feb 17, 2016 at 8:45 AM, David Blaikie via cfe-dev <
>>>>> cfe-dev at lists.llvm.org> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> On Fri, Feb 12, 2016 at 5:43 PM, Alexey Samsonov <vonosmas at gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Feb 11, 2016 at 1:50 PM, David Blaikie <dblaikie at gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Feb 10, 2016 at 3:55 PM, Alexey Samsonov via cfe-dev <
>>>>>>>> cfe-dev at lists.llvm.org> wrote:
>>>>>>>>
>>>>>>>>> I mostly agree with what Richard and Justin said. Adding a few
>>>>>>>>> notes about the general strategy we use:
>>>>>>>>>
>>>>>>>>> (1) lit tests which look "end-to-end" proved to be way more
>>>>>>>>> convenient for testing runtime libraries than unit tests.
>>>>>>>>>
>>>>>>>> We do have
>>>>>>>>> the latter, and use them to provide test coverage for utility
>>>>>>>>> functions, but we quite often accompany fix to the runtime library with
>>>>>>>>> "end-to-end" small reproducer extracted from the real-world code
>>>>>>>>> that exposed the issue.
>>>>>>>>> Incidentally, this tests a whole lot of other functionality: Clang
>>>>>>>>> driver, frontend, LLVM passes, etc, but it's not the intent of the test.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Indeed - this is analogous to the tests for, say, LLD that use
>>>>>>>> llvm-mc to produce the inputs rather than checking in object files. That
>>>>>>>> area is open to some discussion as to just how many tools we should rope
>>>>>>>> in/how isolated we should make tests (eg: maybe building the json object
>>>>>>>> file format was going too far towards isolation? Not clear - opinions
>>>>>>>> differ). But the point of the test is to test the compiler-rt functionality
>>>>>>>> that was added/removed/modified.
>>>>>>>>
>>>>>>>> I think most people are in agreement with that, while acknowledging
>>>>>>>> the fuzzy line about how isolated we might be.
>>>>>>>>
>>>>>>>
>>>>>>> Yes.
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> These tests are sometimes platform-specific and poorly portable,
>>>>>>>>> but they are more reliable (we make the same steps as the
>>>>>>>>> user of the compiler), and serve the purpose of documentation.
>>>>>>>>>
>>>>>>>>> (2) If we change LLVM instrumentation - we add a test to LLVM. If
>>>>>>>>> we change Clang code generation or driver behavior - we add
>>>>>>>>> a test to Clang. No excuses here.
>>>>>>>>>
>>>>>>>>> (3) Sometimes we still add a compiler-rt test for the change in
>>>>>>>>> LLVM or Clang: e.g. if we enhance Clang frontend to teach UBSan
>>>>>>>>> to detecting yet another kind of overflow, it makes sense to add a
>>>>>>>>> test to UBSan test-suite that demonstrates it, in addition to
>>>>>>>>> Clang test verifying that we emit a call to UBSan runtime. Also,
>>>>>>>>> compiler-rt test would allow us to verify that the actual error report
>>>>>>>>> we present to the user is sane.
>>>>>>>>>
>>>>>>>>
>>>>>>>> This bit ^ is a bit unclear to me. If there was no change to the
>>>>>>>> UBSan runtime, and the code generated by Clang is equivalent/similar to an
>>>>>>>> existing use of the UBSan runtime - what is it that the new compiler-rt
>>>>>>>> test is providing? (perhaps you could give a concrete example you had in
>>>>>>>> mind to look at?)
>>>>>>>>
>>>>>>>
>>>>>>> See r235568 (change to Clang) followed by r235569 (change to
>>>>>>> compiler-rt test). Now, it's a cheat because I'm fixing test, not adding
>>>>>>> it. However, I would have definitely added it, if it was missing.
>>>>>>>
>>>>>>
>>>>>> Right, I think the difference here is "if it was missing" - the test
>>>>>> case itself seems like it could be a reasonable one (are there other tests
>>>>>> of the same compiler-rt functionality? (I assume the compiler-rt
>>>>>> functionality is the implementation of sadd/ssub?))
>>>>>>
>>>>>>
>>>>>>> In this case, a change to Clang
>>>>>>> instrumentation (arguments passed to UBSan runtime callbacks)
>>>>>>> improved the user-facing part of the tool, and compiler-rt test suite is a
>>>>>>> good place to verify that.
>>>>>>>
>>>>>>
>>>>>> This seems like the problematic part - changes to LLVM improve the
>>>>>> user-facing part of Clang, but we don't add end-to-end tests of that, as a
>>>>>> general rule. I'm trying to understand why the difference between that and
>>>>>> compiler-rt
>>>>>>
>>>>>
>>>>> In what way do changes in LLVM change the user-facing part of Clang?
>>>>> It obviously depends on how broadly one defines user-facing. Is a 1%
>>>>> performance improvement from a particular optimization user-facing? Is
>>>>> better debug info accuracy user-facing? I'm not sure. But it seems clear
>>>>> that "the user sees a diagnostic or not" definitely is.
>>>>>
>>>>
>>>> There's more than just performance in LLVM - ABI features, and yes, I'd
>>>> argue some pieces of debug info are pretty user facing (as are some
>>>> optimizations). We also have the remarks system in place now. (also the
>>>> compiler crashing (or not) is pretty user facing).
>>>>
>>>
>>> I'd argue that we probably should have some sort of integration tests
>>> for ABI features. I think at the moment we're getting by thanks to
>>> self-hosting and regularly building lots of real-world programs with
>>> ToT-ish compilers.
>>>
>>
>> Perhaps so, but I'd argue that they shouldn't be run as part of "make
>> check" & should be in a separate test grouping (probably mostly run by
>> buildbots) for the purpose of integration testing.
>>
>
> If you have llvm/clang/compiler-rt/libc++/libc++abi checkout, they are not
> run as a part of "make check", only "make check-all", which kind of makes
> sense (run *all* the tests!). You're free to run "make check-clang", "make
> check-asan" etc.
> if you're sure your changes are limited in scope. Just to be clear - do
> you suggest that compiler-rt tests are too heavy for this configuration,
> and want to introduce extra level - i.e. extract "make check-compiler-rt"
> out of "make check-all", and introduce "make check-absolutely-everything",
> that would encompass them?
>

Fair point, check-all would be/is check all the features, but, yes,
potentially pulling out a separate subgroup for integration testing of any
kind. Yes, I realize this has the "-Wall" problem. But I suppose what I'm
getting at is "check" in the LLVM project parlance is, with the exception
of compiler-rt, the "lightweight, immediate verification tests, not
integration testing" & I would rather like that to be the case across the
board.

Perhaps using a different verb for cross-project testing (I don't object to
a catch-all target for running all the different kinds of testing we have).
Naming is hard.

>
>
>> We've made a pretty conscious, deliberate, and consistent effort to not
>> do integration testing across the LLVM projects in "make check"-like
>> testing, and to fix it where we do find it. It seems to me that compiler-rt
>> diverged from that approach and I'm not really in favor of that divergence.
>>
>
> I don't see why consistency by itself is a good thing.
>

Makes it easier to work between projects. Sets expectations for developers
when they work in one area or another.

> As a sanitizer developer, current situation is convenient for me, but if
> it harms / slows down / complicates workflow for other developers or LLVM
> as a whole - sure, let's fix it.
>

One of the things I'm trying to understand is what the benefit of this
extra testing is - to take the add/sub example again, is adding extra
coverage for printing different text through the same mechanism in
compiler-rt valuable? What sort of regressions/bugs is that catching that
makes it compelling to author, maintain, and incur the ongoing execution
cost of?

>
>
>>
>> - David
>>
>>
>>>
>>> -- Sean Silva
>>>
>>>
>>>>
>>>>
>>>>> Also, I think part of this is that in compiler-rt there are usually
>>>>> more moving parts we don't control. E.g. it isn't just the interface
>>>>> between LLVM and clang. The information needs to pass through archivers,
>>>>> linkers, runtime loaders, etc. that all may have issues that affect whether
>>>>> the user sees the final result. In general the interface between LLVM and
>>>>> clang has no middlemen so there really isn't anything to check.
>>>>>
>>>>
>>>> Correctness/behavior of the compiler depends on those things too
>>>> (linkers, loaders, etc) to produce the final working product the user
>>>> requested. If we emitted symbols with the wrong linkage we could produce
>>>> linker errors, drop important entities, etc. But we don't generally test
>>>> that the output of LLVM/Clang produces the right binary when linked, we
>>>> test that it produces the right linkages on the resulting entities.
>>>>
>>>> - David
>>>>
>>>>
>>>>>
>>>>> -- Sean Silva
>>>>>
>>>>>
>>>>>>
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>> You may argue that Clang test would have been enough (I disagree
>>>>>>> with that), or that it qualifies as "adding coverage" (maybe).
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> (4) True, we're intimidated by test-suite :) I feel that current
>>>>>>>>> use of compiler-rt test suite to check compiler-rt libs better follows
>>>>>>>>> the doctrine described by David.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Which David? ;) (I guess David Li, not me)
>>>>>>>>
>>>>>>>
>>>>>>> Nope, paragraph 2 from your original email.
>>>>>>>
>>>>>>>
>>>>>>>> I think maybe what could be worth doing would be separating out the
>>>>>>>> broader/intentionally "end to end" sort of tests from the ones intended to
>>>>>>>> test compiler-rt in relative isolation.
>>>>>>>>
>>>>>>>
>>>>>>> It's really hard to draw the line here, even some of compiler-rt
>>>>>>> unit tests require instrumentation, therefore depend on new features of
>>>>>>> Clang/LLVM. Unlike builtins, which are
>>>>>>> trivial to test in isolation, testing sanitizer runtimes in
>>>>>>> isolation (w/o compiler) is often hard to implement (we tried to do so for
>>>>>>> TSan, but found unit tests extremely hard to write),
>>>>>>> and is barely useful - compiler-rt runtimes don't consist of modules
>>>>>>> (like LLVMCodeGen and LLVMMC for instance), and are never used w/o the
>>>>>>> compiler anyway.
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> Most importantly, I'd expect only the latter to run in a "make
>>>>>>>> check-all" run, as we do for Clang/LLVM, etc.
>>>>>>>>
>>>>>>>
>>>>>>> And now we're getting to the goals :) Why would such a change be
>>>>>>> good? Do you worry about the time it takes to execute the test suite?
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> Also, there's significant complexity in compiler-rt test suite
>>>>>>>>> that narrows the tests executed
>>>>>>>>> to those supported by current host.
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Wed, Feb 10, 2016 at 2:33 PM, Xinliang David Li via cfe-dev <
>>>>>>>>> cfe-dev at lists.llvm.org> wrote:
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Wed, Feb 10, 2016 at 2:11 PM, Justin Bogner via llvm-dev <
>>>>>>>>>> llvm-dev at lists.llvm.org> wrote:
>>>>>>>>>>
>>>>>>>>>>> David Blaikie via cfe-dev <cfe-dev at lists.llvm.org> writes:
>>>>>>>>>>> > Recently had a bit of a digression in a review thread related
>>>>>>>>>>> to some tests
>>>>>>>>>>> > going in to compiler-rt (
>>>>>>>>>>> >
>>>>>>>>>>> http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160208/330759.html
>>>>>>>>>>> > ) and there seems to be some disconnect at least between my
>>>>>>>>>>> expectations
>>>>>>>>>>> > and reality. So I figured I'd have a bit of a discussion out
>>>>>>>>>>> here on the
>>>>>>>>>>> > dev lists where there's a bit more visibility.
>>>>>>>>>>> >
>>>>>>>>>>> > My basic expectation is that the lit tests in any LLVM project
>>>>>>>>>>> except the
>>>>>>>>>>> > test-suite are targeted tests intended to test only the
>>>>>>>>>>> functionality in
>>>>>>>>>>> > the project. This seems like a pretty well accepted doctrine
>>>>>>>>>>> across most
>>>>>>>>>>> > LLVM projects - most visibly in Clang, where we make a
>>>>>>>>>>> concerted effort not
>>>>>>>>>>> > to have tests that execute LLVM optimizations, etc.
>>>>>>>>>>> >
>>>>>>>>>>> > There are exceptions/middle ground to this - DIBuilder is in
>>>>>>>>>>> LLVM, but
>>>>>>>>>>> > essentially tested in Clang rather than writing LLVM unit
>>>>>>>>>>> tests. It's
>>>>>>>>>>> > somewhat unavoidable that any of the IR building code
>>>>>>>>>>> (IRBuilder,
>>>>>>>>>>> > DIBuilder, IR asm printing, etc) is 'tested' incidentally in
>>>>>>>>>>> Clang in
>>>>>>>>>>> > process of testing Clang's IR generation. But these are seen
>>>>>>>>>>> as incidental,
>>>>>>>>>>> > not intentionally trying to cover LLVM with Clang tests (we
>>>>>>>>>>> don't add a
>>>>>>>>>>> > Clang test if we add a new feature to IRBuilder just to test
>>>>>>>>>>> the IRBuilder).
>>>>>>>>>>> >
>>>>>>>>>>> > Another case with some middle ground are things like linker
>>>>>>>>>>> tests and
>>>>>>>>>>> > objdump, dwarfdump, etc - in theory to isolate the test we
>>>>>>>>>>> would checkin
>>>>>>>>>>> > binaries (or the textual object representation lld had for a
>>>>>>>>>>> while, etc) to
>>>>>>>>>>> > test those tools. Some tests instead checkin assembly and
>>>>>>>>>>> assemble it with
>>>>>>>>>>> > llvm-mc. Again, not to cover llvm-mc, but on the assumption
>>>>>>>>>>> that llvm-mc is
>>>>>>>>>>> > tested, and just using it as a tool to make tests easier to
>>>>>>>>>>> maintain.
>>>>>>>>>>> >
>>>>>>>>>>> > So I was surprised to find that the compiler-rt lit tests seem
>>>>>>>>>>> to diverge
>>>>>>>>>>> > from this philosophy & contain more intentional end-to-end
>>>>>>>>>>> tests (eg:
>>>>>>>>>>> > adding a test there when making a fix to Clang to add a
>>>>>>>>>>> counter to a
>>>>>>>>>>> > function that was otherwise missing a counter - I'd expect
>>>>>>>>>>> that to be
>>>>>>>>>>> > tested in Clang and that there would already be coverage in
>>>>>>>>>>> compiler-rt for
>>>>>>>>>>> > "if a function has a counter, does compiler-rt do the right
>>>>>>>>>>> thing with that
>>>>>>>>>>> > counter" (testing whatever code in compiler-rt needs to be
>>>>>>>>>>> tested)).
>>>>>>>>>>> >
>>>>>>>>>>> > Am I off base here? Are compiler-rt's tests fundamentally
>>>>>>>>>>> different to the
>>>>>>>>>>> > rest of the LLVM project? Why? Should they continue to be?
>>>>>>>>>>>
>>>>>>>>>>> I think there's a bit of grey area in terms testing the runtime -
>>>>>>>>>>> generally it's pretty hard to use the runtime without a fairly
>>>>>>>>>>> end-to-end test, so tests of the runtime often end up looking
>>>>>>>>>>> pretty
>>>>>>>>>>> close to an end-to-end test.
>>>>>>>>>>>
>>>>>>>>>>> That said, I don't think that should be used as an excuse to
>>>>>>>>>>> sneak
>>>>>>>>>>> arbitrary end-to-end tests into compiler-rt. We should
>>>>>>>>>>> absolutely write
>>>>>>>>>>> tests in clang and llvm that we're inputting what we expect to
>>>>>>>>>>> the
>>>>>>>>>>> runtime and try to keep the tests in compiler-rt as focused on
>>>>>>>>>>> just
>>>>>>>>>>> exercising the runtime code as possible.
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Yes, we should not use compiler-rt tests as an excuse of not
>>>>>>>>>> adding clang/LLVM test. The latter should always be added if possible --
>>>>>>>>>> they are platform independent and is the first level of defense.  runtime
>>>>>>>>>> test's focus is also more on the runtime lib itself and interaction between
>>>>>>>>>>  runtime, compiler, binutils and other tools.
>>>>>>>>>>
>>>>>>>>>> David
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> IIUC, the correct place for integration tests in general is
>>>>>>>>>>> somewhere
>>>>>>>>>>> like test-suite, but I think it's a bit intimidating to some
>>>>>>>>>>> people to
>>>>>>>>>>> add new tests there (Are there docs on this?). I suspect some of
>>>>>>>>>>> the
>>>>>>>>>>> profiling related tests in compiler-rt are doing a bit much and
>>>>>>>>>>> should
>>>>>>>>>>> graduate to a spot in the test-suite (but I don't have time to
>>>>>>>>>>> volunteer
>>>>>>>>>>> to do the work, unfortunately).
>>>>>>>>>>> _______________________________________________
>>>>>>>>>>> LLVM Developers mailing list
>>>>>>>>>>> llvm-dev at lists.llvm.org
>>>>>>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _______________________________________________
>>>>>>>>>> cfe-dev mailing list
>>>>>>>>>> cfe-dev at lists.llvm.org
>>>>>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Alexey Samsonov
>>>>>>>>> vonosmas at gmail.com
>>>>>>>>>
>>>>>>>>> _______________________________________________
>>>>>>>>> cfe-dev mailing list
>>>>>>>>> cfe-dev at lists.llvm.org
>>>>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Alexey Samsonov
>>>>>>> vonosmas at gmail.com
>>>>>>>
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> cfe-dev mailing list
>>>>>> cfe-dev at lists.llvm.org
>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
>
> --
> Alexey Samsonov
> vonosmas at gmail.com
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20160226/a37df560/attachment.html>