[cfe-dev] Test Suite - Livermore Loops

Mon Jan 7 14:23:42 PST 2013

On Mon, Jan 7, 2013 at 2:17 PM, David Blaikie <dblaikie at gmail.com> wrote:

> On Mon, Jan 7, 2013 at 2:05 PM, Daniel Dunbar <daniel at zuster.org> wrote:
> >
> >
> >
> > On Mon, Jan 7, 2013 at 1:52 PM, David Blaikie <dblaikie at gmail.com>
> > wrote:
> >>
> >> On Mon, Jan 7, 2013 at 1:46 PM, Daniel Dunbar <daniel at zuster.org>
> >> wrote:
> >> >
> >> >
> >> >
> >> > On Mon, Jan 7, 2013 at 1:14 PM, David Blaikie <dblaikie at gmail.com>
> >> > wrote:
> >> >>
> >> >> On Mon, Jan 7, 2013 at 12:58 PM, Daniel Dunbar <daniel at zuster.org>
> >> >> wrote:
> >> >> > To weigh in here...
> >> >> >
> >> >> >
> >> >> > On Thu, Jan 3, 2013 at 8:15 AM, David Blaikie <dblaikie at gmail.com>
> >> >> > wrote:
> >> >> >>
> >> >> >> +Daniel & Michael who work on the LNT infrastructure & might have
> >> >> >> some
> >> >> >> thoughts on the differences & their merits & motivations.
> >> >> >>
> >> >> >> On Thu, Jan 3, 2013 at 4:05 AM, Renato Golin
> >> >> >> <renato.golin at linaro.org>
> >> >> >> wrote:
> >> >> >> > David,
> >> >> >> >
> >> >> >> > I did some more work on the Livermore Loops and I found out that
> >> >> >> > the issue is the difference in the parameters between a single
> >> >> >> > step and a multi step compilation.
> >> >> >>
> >> >> >> Thanks for the investigation.
> >> >> >>
> >> >> >> > When you compile "clang kernel06.c" it works fine, but when you
> >> >> >> > run all the steps (clang -emit-llvm + llvm-as + opt + llc etc),
> >> >> >> > the default options of each and how they interact are exposing a
> >> >> >> > bug in the generated code.
> >> >> >>
> >> >> >> Sounds quite plausible.
> >> >> >>
> >> >> >> > This difference is due to the fact that I'm running the
> >> >> >> > test-suite using LNT, while the build bots are running it using
> >> >> >> > Make directly. I'd expect them both to be the same, but apparently
> >> >> >> > they're quite different in what kind of parameters they use,
> >> >> >> > passes they test and results they get.
> >> >> >> >
> >> >> >> > I think there are two courses of action here:
> >> >> >> >
> >> >> >> > 1. Identify the issue, isolate the case and create a bug to
> >> >> >> > resolve later.
> >> >> >> > 2. Make sure LNT does exactly what the build bots are doing
> >> >> >>
> >> >> >> Part of the issue here is whether or not the Make-based execution
> >> >> >> is still maintained/valued. I'm getting the impression that the LNT
> >> >> >> execution may already be, or be becoming, the standard way to run
> >> >> >> the test suite even when not gathering perf statistics.
> >> >> >> Michael/Daniel - is that the case?
> >> >> >
> >> >> >
> >> >> > Well, the distinction isn't really between LNT and non-LNT, it's
> >> >> > between the TEST=nightly and TEST=simple styles supported by the
> >> >> > Makefiles. LNT uses the TEST=simple style and that is all I care to
> >> >> > support.
> >> >>
> >> >> Fair enough, though that's sort of what I was getting at in a way:
> >> >> whatever way LNT is driving the test-suite is, essentially, the only
> >> >> supported way. Sure we can have non-LNT bots (not ideal, perhaps -
> >> >> still another path to maintain/possibly diverge by accident) but they
> >> >> certainly shouldn't be using anything other than the way LNT uses the
> >> >> test-suite (ie: TEST=simple).
> >> >>
> >> >> Can we kill TEST=nightly, then, since it's just an
> >> >> untested/unsupported trap? Or do you know of users that have a need
> >> >> for this?
> >> >
> >> >
> >> > It's untested, but as supported as anything else (I try not to break
> >> > it, and will fix bugs in it).
> >> >
> >> > And yes, there are still users that use this regularly. Most of that
> >> > is probably habit among old-school LLVMers, but it's still useful when
> >> > you want to do direct A/B testing of optimizer changes (support for
> >> > things like OPTBETA and LLCBETA), or when you want to test a change
> >> > without requiring a compiler rebuild.
> >> >
> >> > For example, we still don't have very good support in the compiler for
> >> > tweaking various parts of the compilation process (for example,
> >> > running with a custom pass list), so the easiest way to test addition
> >> > of a new pass may still be using TEST=nightly.
> >> >
> >> > My natural tendency is towards "if it isn't broke, don't kill it", and
> >> > not to try and remove it until we have a new separate way of running
> >> > the test suite outside of the Makefiles.
> >> >
> >> >>
> >> >> >
> >> >> > Historically, the old way of testing (TEST=nightly) used the
> >> >> > various LLVM tools to effect a compilation because there weren't
> >> >> > compilers that worked. However, this is a bad way to "test" the
> >> >> > product that most users actually care about, which is the compiler.
> >> >> >
> >> >> > With TEST=simple, all the compilation is done using the compiler
> >> >> > just as an end user would. If you want LTO, the right way to get it
> >> >> > is to use the compiler's support for LTO. This is how we test LTO
> >> >> > internally. I've never tried to get LTO working on Linux, but it
> >> >> > should be possible using the gold plugin and passing the right
> >> >> > compiler options.
> >> >> >
> >> >> >> If so, should we rip out the direct Make execution, or do
> >> >> >> something to otherwise warn/disable it?
> >> >> >
> >> >> >
> >> >> > Per my other thread polling users of the test-suite, there are
> >> >> > still people who use the Makefiles to do more custom things. I
> >> >> > personally would love to deprecate them completely, but they do
> >> >> > support some useful workflows.
> >> >> >
> >> >> > My ideal would be:
> >> >> > 1. Migrate LNT to drive the test-suite using a more sane mechanism
> >> >> > (not a glob of Makefiles). I would like the "more sane mechanism" to
> >> >> > be lit-based.
> >> >> > 2. Maybe do some work to make using lit to drive the test-suite
> >> >> > more convenient and hopefully support some of the useful workflows
> >> >> > the Makefiles support with less of the crap.
> >> >> > 3. Deprecate the Makefiles, or at least let them die through lack of
> >> >> > maintenance.
> >> >> >
> >> >> > Does that answer the parts you wanted my input on?
> >> >>
> >> >> More or less, I suppose I wouldn't mind an opinion on the "should we
> >> >> kill off/migrate bots from test-suite invocation to LNT?" issue too.
> >> >> (my assumption is that your answer to that is "yes", but just want to
> >> >> be clear)
> >> >
> >> >
> >> > Yes, definitely.
> >>
> >> Hmm, this seems at odds with your above opinion on not killing
> >> TEST=nightly, though. If we actively migrate bots away from
> >> TEST=nightly we're going to break it (indeed Renato already has,
> >> which is how this thread came up).
> >
> >
> > By "not breaking it", I meant the infrastructure of it, not whether
> > the tests work.
> >
> > As for the actual LLVM bug, we should probably try and get an LTO LNT bot
> > up, though, which would most likely hit the same bug.
> >
> >> If it's broken that, I would think,
> >> is going to cause some confusion/problems for those using it &
> >> expecting things to pass. Is your impression/experience that those
> >> running this manually for custom testing aren't too concerned about
> >> spurious failures?
> >
> >
> > In my mind, I don't necessarily think "it's broken" makes sense in this
> > context. It's not TEST=nightly that is broken, it is the compiler in the
> > context of one architecture, one set of compile options, etc.
>
> Agreed. There's a distinction between the infrastructure being broken
> & the tests not passing.
>
> > I expect/hope developers to be aware that compiler bugs may only manifest
> > under a very specific set of circumstances, and so they need to run their
> > tests in the same way as the buildbots if they want results to match. And
> > I hope most core LLVM developers realize that the way TEST=nightly ends up
> > building binaries is very different from using the compiler directly, but
> > if not, most of them figure this out very quickly in practice.
>
> Right, my concern is that if we leave "TEST=nightly" in it'll just be
> a trap for people to run that & expect to get clean results when
> there's no infrastructure ensuring that those tests pass at all on any
> architecture.

I don't think it's really documented, and the current docs steer towards LNT
or TEST=simple, so I don't think this is a big problem.

> It'll easily become a hive of arbitrary failures & no
> clear way to distinguish new failures from old (which will result in
> people either not using it or trying to investigate issues/failures
> that weren't introduced by their change anyway (or having to run it
> twice - once to get a baseline and again to get their results &
> carefully diffing between the two to see where the new failures are))
>

If it becomes a hive of arbitrary failures that means that there are a
bunch of untested and buggy paths in LLVM, so that is a much bigger problem
and indicates a lack of test coverage.

> This seems likely to waste engineer time & removing the option would
> remove the pitfall/trap. But perhaps people using it are used to
> arbitrary failures? I'm not sure.
>

I wouldn't say they are used to arbitrary failures, but it's not much
different than finding a bug in some other part of LLVM when you change
some unrelated part of codegen. This happens pretty frequently just due to
the nature of compilers.

 - Daniel

> > I see LNT + TEST=simple as the "right way" to do large scale testing w/ the
> > test suite and buildbots, but if developers want to use TEST=nightly for
> > experiments or development it's still there. I have actively tried to
> > encourage people to switch over to using TEST=simple when possible, but
> > it's hard to get people to change existing workflows if there isn't a clear
> > benefit.
> >
> >  - Daniel
> >
> >>
> >>
> >> - David
> >>
> >> >
> >> >  - Daniel
> >> >
> >> >>
> >> >>
> >> >> - David
> >> >>
> >> >> >
> >> >> >  - Daniel
> >> >> >
> >> >> >>
> >> >> >> > I'm working on item 1 right now, not sure how item 2 can be
> >> >> >> > solved...
> >> >> >> >
> >> >> >> > Of course, the fact that it's not the same flow meant we caught a
> >> >> >> > bug in LLVM, but that's bound to create more confusion and broken
> >> >> >> > commits, which is worse in the long run.
> >> >> >>
> >> >> >> Yeah, unless there's some strong/specific motivation for this I'd
> >> >> >> be in favor of removing the difference (or removing the Make-based
> >> >> >> execution entirely)
> >> >> >>
> >> >> >> > Also, if we're not running LNT as often as buildbots, the
> >> >> >> > benefit of having them different is sporadic at best.
> >> >> >>
> >> >> >> we're running both pretty regularly, I think - if anything I
> >> >> >> suspect we might be running LNT on more configurations than the
> >> >> >> Make-based execution (except that on some LNT runners we're
> >> >> >> multisampling, so it's slower)
> >> >> >>
> >> >> >> > When I set up some tests to run on ARM I have done both direct
> >> >> >> > and multi-step, to make sure they were generating the same code
> >> >> >> > and in many cases I found that the order in which the passes were
> >> >> >> > executed was breaking some tests.
> >> >> >> >
> >> >> >> > We managed to get the EDG bridge to set it up in the same way as
> >> >> >> > the multi-pass would, so we would get similar results, but it
> >> >> >> > doesn't seem to be the case with clang.
> >> >> >> >
> >> >> >> > cheers,
> >> >> >> > --renato
> >> >> >
> >> >> >
> >> >
> >> >
> >
> >
>