[cfe-dev] Test Suite - Livermore Loops

David Blaikie dblaikie at gmail.com
Mon Jan 7 14:46:26 PST 2013


On Mon, Jan 7, 2013 at 2:23 PM, Daniel Dunbar <daniel at zuster.org> wrote:
>
>
>
> On Mon, Jan 7, 2013 at 2:17 PM, David Blaikie <dblaikie at gmail.com> wrote:
>>
>> On Mon, Jan 7, 2013 at 2:05 PM, Daniel Dunbar <daniel at zuster.org> wrote:
>> >
>> >
>> >
>> > On Mon, Jan 7, 2013 at 1:52 PM, David Blaikie <dblaikie at gmail.com>
>> > wrote:
>> >>
>> >> On Mon, Jan 7, 2013 at 1:46 PM, Daniel Dunbar <daniel at zuster.org>
>> >> wrote:
>> >> >
>> >> >
>> >> >
>> >> > On Mon, Jan 7, 2013 at 1:14 PM, David Blaikie <dblaikie at gmail.com>
>> >> > wrote:
>> >> >>
>> >> >> On Mon, Jan 7, 2013 at 12:58 PM, Daniel Dunbar <daniel at zuster.org>
>> >> >> wrote:
>> >> >> > To weigh in here...
>> >> >> >
>> >> >> >
>> >> >> > On Thu, Jan 3, 2013 at 8:15 AM, David Blaikie <dblaikie at gmail.com>
>> >> >> > wrote:
>> >> >> >>
>> >> >> >> +Daniel & Michael who work on the LNT infrastructure & might have
>> >> >> >> some
>> >> >> >> thoughts on the differences & their merits & motivations.
>> >> >> >>
>> >> >> >> On Thu, Jan 3, 2013 at 4:05 AM, Renato Golin
>> >> >> >> <renato.golin at linaro.org>
>> >> >> >> wrote:
>> >> >> >> > David,
>> >> >> >> >
>> >> >> >> > I got some more work on the Livermore Loops and I found out
>> >> >> >> > that
>> >> >> >> > the
>> >> >> >> > issue
>> >> >> >> > is the difference in the parameters between a single step and a
>> >> >> >> > multi
>> >> >> >> > step
>> >> >> >> > compilation.
>> >> >> >>
>> >> >> >> Thanks for the investigation.
>> >> >> >>
>> >> >> >> > When you compile "clang kernel06.c" it works fine, but when you
>> >> >> >> > run all the steps (clang -emit-llvm + llvm-as + opt + llc etc),
>> >> >> >> > the default options of each and how they interact expose a bug
>> >> >> >> > in the generated code.
>> >> >> >>
>> >> >> >> Sounds quite plausible.
>> >> >> >>
>> >> >> >> > This difference is due to the fact that I'm running the
>> >> >> >> > test-suite
>> >> >> >> > using
>> >> >> >> > LNT, while the build bots are running it using Make directly.
>> >> >> >> > I'd
>> >> >> >> > expect
>> >> >> >> > them both to be the same, but apparently they're quite
>> >> >> >> > different
>> >> >> >> > in
>> >> >> >> > what
>> >> >> >> > kind of parameters they use, passes they test and results they
>> >> >> >> > get.
>> >> >> >> >
>> >> >> >> > I think there are two courses of action here:
>> >> >> >> >
>> >> >> >> > 1. Identify the issue, isolate the case and create a bug to
>> >> >> >> > resolve
>> >> >> >> > later.
>> >> >> >> > 2. Make sure LNT does exactly what the build bots are doing
>> >> >> >>
>> >> >> >> Part of the issue here is whether or not the Make-based execution
>> >> >> >> is
>> >> >> >> still maintained/valued. I'm getting the impression that the LNT
>> >> >> >> execution may be already, or be becoming, the standard way to run
>> >> >> >> the
>> >> >> >> test suite even when not gathering perf statistics.
>> >> >> >> Michael/Daniel -
>> >> >> >> is that the case?
>> >> >> >
>> >> >> >
>> >> >> > Well, the distinction isn't really between LNT and non-LNT, it's
>> >> >> > between the TEST=nightly and TEST=simple styles supported by the
>> >> >> > Makefiles. LNT uses the TEST=simple style and that is all I care
>> >> >> > to support.
>> >> >>
>> >> >> Fair enough, though that's sort of what I was getting at in a way:
>> >> >> whatever way LNT is driving the test-suite is, essentially, the only
>> >> >> supported way. Sure we can have non-LNT bots (not ideal, perhaps -
>> >> >> still another path to maintain/possibly diverge by accident) but
>> >> >> they
>> >> >> certainly shouldn't be using anything other than the way LNT uses
>> >> >> the
>> >> >> test-suite (ie: TEST=simple).
>> >> >>
>> >> >> Can we kill TEST=nightly, then, since it's just an
>> >> >> untested/unsupported trap? Or do you know of users that have a need
>> >> >> for this?
>> >> >
>> >> >
>> >> > It's untested, but as supported as anything else (I try not to break
>> >> > it,
>> >> > and
>> >> > will fix bugs in it).
>> >> >
>> >> > And yes, there are still users that use this regularly. Most of that
>> >> > is
>> >> > probably habit among old-school LLVMers, but it's still useful when
>> >> > you
>> >> > want
>> >> > to do direct A/B testing of optimizer changes (support for things
>> >> > like
>> >> > OPTBETA and LLCBETA), or when you want to test a change without
>> >> > requiring a
>> >> > compiler rebuild.
>> >> >
>> >> > For example, we still don't have very good support in the compiler
>> >> > for
>> >> > tweaking various parts of the compilation process (for example,
>> >> > running
>> >> > with
>> >> > a custom pass list), so the easiest way to test addition of a new
>> >> > pass
>> >> > may
>> >> > still be using TEST=nightly.
>> >> >
>> >> > My natural tendency is towards "if it isn't broke, don't kill it",
>> >> > and
>> >> > not
>> >> > to try and remove it until we have a new separate way of running the
>> >> > test
>> >> > suite outside of the Makefiles.
>> >> >
>> >> >>
>> >> >> >
>> >> >> > Historically, the old way of testing (TEST=nightly) used the
>> >> >> > various
>> >> >> > LLVM
>> >> >> > tools to effect a compilation because there weren't compilers that
>> >> >> > worked.
>> >> >> > However, this is a bad way to "test" the product that most users
>> >> >> > actually
>> >> >> > care about, which is the compiler.
>> >> >> >
>> >> >> > With TEST=simple, all the compilation is done using the compiler
>> >> >> > just as an end user would. If you want LTO, the right way to get it
>> >> >> > is to use the compiler's support for LTO. This is how we test LTO
>> >> >> > internally. I've never tried to get LTO working on Linux, but it
>> >> >> > should be possible using the gold plugin and passing the right
>> >> >> > compiler options.
>> >> >> >
>> >> >> >> If so, should we rip out the direct Make execution, or do
>> >> >> >> something
>> >> >> >> to
>> >> >> >> otherwise warn/disable it?
>> >> >> >
>> >> >> >
>> >> >> > Per my other thread polling users of the test-suite, there are
>> >> >> > still
>> >> >> > people
>> >> >> > who use the Makefiles to do more custom things. I personally would
>> >> >> > love
>> >> >> > to
>> >> >> > deprecate them completely, but they do support some useful
>> >> >> > workflows.
>> >> >> >
>> >> >> > My ideal would be:
>> >> >> > 1. Migrate LNT to drive the test-suite using a more sane mechanism
>> >> >> > (not
>> >> >> > a
>> >> >> > glob of Makefiles). I would like the "more sane mechanism" to be
>> >> >> > lit-based.
>> >> >> > 2. Maybe do some work to make using lit to drive the test-suite
>> >> >> > more
>> >> >> > convenient and hopefully support some of the useful workflows the
>> >> >> > Makefiles
>> >> >> > support with less of the crap.
>> >> >> > 3. Deprecate the Makefiles, or at least let them die through lack
>> >> >> > of maintenance.
>> >> >> >
>> >> >> > Does that answer the parts you wanted my input on?
>> >> >>
>> >> >> More or less, I suppose I wouldn't mind an opinion on the "should we
>> >> >> kill off/migrate bots from test-suite invocation to LNT?" issue too.
>> >> >> (my assumption is that your answer to that is "yes", but just want
>> >> >> to
>> >> >> be clear)
>> >> >
>> >> >
>> >> > Yes, definitely.
>> >>
>> >> Hmm, this seems at odds with your above opinion on not killing
>> >> TEST=nightly, though. If we actively migrate bots away from
>> >> TEST=nightly we're going to break it (indeed Renato already has,
>> >> which is how this thread came up).
>> >
>> >
>> > By "not breaking it", I meant the infrastructure of it, not whether
>> > the tests work.
>> >
>> > As for the actual LLVM bug, we should probably try and get an LTO LNT
>> > bot
>> > up, though, which would most likely hit the same bug.
>> >
>> >> If it's broken that, I would think,
>> >> is going to cause some confusion/problems for those using it &
>> >> expecting things to pass. Is your impression/experience that those
>> >> running this manually for custom testing aren't too concerned about
>> >> spurious failures?
>> >
>> >
>> > In my mind, I don't necessarily think "it's broken" makes sense in this
>> > context. It's not TEST=nightly that is broken, it is the compiler in the
>> > context of one architecture, one set of compile options, etc.
>>
>> Agreed. There's a distinction between the infrastructure being broken
>> & the tests not passing.
>>
>> > I expect/hope developers to be aware that compiler bugs may only
>> > manifest
>> > under a very specific set of circumstances, and so they need to run
>> > their
>> > tests in the same way as the buildbots if they want results to match.
>> > And I
>> > hope most core LLVM developers realize that the way TEST=nightly ends up
>> > building binaries is very different from using the compiler directly,
>> > but if
>> > not most of them figure this out very quickly in practice.
>>
>> Right, my concern is that if we leave "TEST=nightly" in it'll just be
>> a trap for people to run that & expect to get clean results when
>> there's no infrastructure ensuring that those tests pass at all on any
>> architecture.
>
>
> I don't think it's really documented and the current docs steer towards LNT
> or TEST=simple, so I don't think this is a big problem.
>
>>
>> It'll easily become a hive of arbitrary failures & no
>> clear way to distinguish new failures from old (which will result in
>> people either not using it or trying to investigate issues/failures
>> that weren't introduced by their change anyway (or having to run it
>> twice - once to get a baseline and again to get their results &
>> carefully diffing between the two to see where the new failures are))
>
>
> If it becomes a hive of arbitrary failures that means that there are a bunch
> of untested and buggy paths in LLVM, so that is a much bigger problem and
> indicates a lack of test coverage.

This is already happening - Renato introduced a bug that only fired on
TEST=nightly (& not TEST=simple, because he only verified with LNT).
If it weren't for the buildbots (& my looking at them/pestering him)
we wouldn't have known it was failing until some random developer
ran the test suite to verify some unrelated change.
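
To make the divergence concrete, here is a rough Python sketch of the two
build shapes discussed in this thread: one direct compiler invocation
(roughly the TEST=simple shape) versus the multi-tool chain Renato
described (clang -emit-llvm + llvm-as + opt + llc). The function names and
the -O3 flag are assumptions for illustration only, not what the
test-suite Makefiles actually pass. The commands are only built, not
executed.

```python
from pathlib import Path


def single_step_commands(source, opt_level="-O3"):
    """One compiler invocation, as an end user would run it."""
    stem = Path(source).stem
    return [["clang", opt_level, source, "-o", stem]]


def multi_step_commands(source, opt_level="-O3"):
    """Separate tool invocations; each tool applies its own defaults."""
    stem = Path(source).stem
    ll, bc, opt_bc, asm = f"{stem}.ll", f"{stem}.bc", f"{stem}.opt.bc", f"{stem}.s"
    return [
        ["clang", "-S", "-emit-llvm", source, "-o", ll],  # C -> textual IR
        ["llvm-as", ll, "-o", bc],                        # textual IR -> bitcode
        ["opt", opt_level, bc, "-o", opt_bc],             # optimize bitcode
        ["llc", opt_level, opt_bc, "-o", asm],            # bitcode -> assembly
        ["clang", asm, "-o", stem],                       # assemble and link
    ]


if __name__ == "__main__":
    for cmd in single_step_commands("kernel06.c") + multi_step_commands("kernel06.c"):
        print(" ".join(cmd))
```

Since every tool in the second list picks its own defaults, the two paths
can run different passes in a different order, which is how a bug can
show up in one mode and not in the other.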

>> This seems likely to waste engineer time & removing the option would
>> remove the pitfall/trap. But perhaps people using it are used to
>> arbitrary failures? I'm not sure.
>
>
> I wouldn't say they are used to arbitrary failures, but it's not much
> different than finding a bug in some other part of LLVM when you change some
> unrelated part of codegen. This happens pretty frequently just due to the
> nature of compilers.

A little different - usually if you make a change that reveals a
latent bug elsewhere it's still a problem /now/ because the bug was
latent previously (hidden by another optimization, etc) and your
change will, effectively, introduce the bug to users. But if the test
suite has existing failures in it & a developer runs it to verify
their change, they receive failures that their change didn't cause
or reveal - so they end up investigating things that were already
broken & may or may not actually be important & certainly shouldn't
be important to /them/ (they should instead be important to whoever
committed the regression, ideally).
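
The "run it twice and diff" workaround mentioned earlier in the thread
could look something like the Python sketch below. It assumes each run
has already been reduced to a mapping from test name to status string;
that format, the "PASS" label, and the test names are hypothetical and
are not the output format of LNT or the Makefiles.

```python
def new_failures(baseline, candidate):
    """Return tests that fail in `candidate` but did not fail in `baseline`.

    Both arguments map a test name to a status string; anything other
    than "PASS" is treated as a failure (an assumption of this sketch).
    """
    already_failing = {name for name, status in baseline.items() if status != "PASS"}
    return sorted(
        name
        for name, status in candidate.items()
        if status != "PASS" and name not in already_failing
    )


if __name__ == "__main__":
    # Hypothetical results, for illustration only.
    before = {"test_a": "PASS", "test_b": "FAIL"}
    after = {"test_a": "FAIL", "test_b": "FAIL", "test_c": "PASS"}
    print(new_failures(before, after))
```

This only filters out failures that were already present; it does nothing
about the cost of the extra baseline run.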

- David

>
>  - Daniel
>
>
>>
>> > I see LNT + TEST=simple as the "right way" to do large scale testing w/
>> > the test suite and buildbots, but if developers want to use TEST=nightly
>> > for experiments or development it's still there. I have actively tried
>> > to encourage people to switch over to using TEST=simple when possible,
>> > but it's hard to get people to change existing workflows if there isn't
>> > a clear benefit.
>> >
>> >  - Daniel
>> >
>> >>
>> >>
>> >> - David
>> >>
>> >> >
>> >> >  - Daniel
>> >> >
>> >> >>
>> >> >>
>> >> >> - David
>> >> >>
>> >> >> >
>> >> >> >  - Daniel
>> >> >> >
>> >> >> >>
>> >> >> >> > I'm working on item 1 right now, not sure how item 2 can be
>> >> >> >> > solved...
>> >> >> >> >
>> >> >> >> > Of course, the fact that it's not the same flow meant we caught
>> >> >> >> > a bug in LLVM, but that's bound to create more confusion and
>> >> >> >> > broken commits, which is worse in the long run.
>> >> >> >>
>> >> >> >> Yeah, unless there's some strong/specific motivation for this I'd
>> >> >> >> be
>> >> >> >> in favor of removing the difference (or removing the Make-based
>> >> >> >> execution entirely)
>> >> >> >>
>> >> >> >> > Also, if we're not running LNT as often as buildbots, the
>> >> >> >> > benefit
>> >> >> >> > of
>> >> >> >> > having
>> >> >> >> > them different is sporadic at best.
>> >> >> >>
>> >> >> >> We're running both pretty regularly, I think - if anything I
>> >> >> >> suspect
>> >> >> >> we might be running LNT on more configurations than the
>> >> >> >> Make-based
>> >> >> >> execution (except that on some LNT runners we're multisampling,
>> >> >> >> so
>> >> >> >> it's slower)
>> >> >> >>
>> >> >> >> > When I set up some tests to run on ARM I have done both direct
>> >> >> >> > and
>> >> >> >> > multi-step, to make sure they were generating the same code and
>> >> >> >> > in
>> >> >> >> > many
>> >> >> >> > cases I found that the order in which the passes were executed
>> >> >> >> > was
>> >> >> >> > breaking
>> >> >> >> > some tests.
>> >> >> >> >
>> >> >> >> > We managed to get the EDG bridge to set it up in the same way
>> >> >> >> > as
>> >> >> >> > the
>> >> >> >> > multi-pass would, so we would get similar results, but it
>> >> >> >> > doesn't
>> >> >> >> > seem
>> >> >> >> > to be
>> >> >> >> > the case with clang.
>> >> >> >> >
>> >> >> >> > cheers,
>> >> >> >> > --renato
>> >> >> >
>> >> >> >
>> >> >
>> >> >
>> >
>> >
>
>