[lldb-dev] New test summary results formatter

Wed Dec 2 21:44:10 PST 2015

Thanks, Zachary.

On Wed, Dec 2, 2015 at 3:59 PM, Zachary Turner <zturner at google.com> wrote:

> Now that it's working in Python 3 and I've had a chance to see the output,
> I think I like it much better.  These are the things I noticed (good and
> bad):
>
> 1) Output of each test is printed on the fly as the test is run.  Much
> better than staring at a black screen for 5 minutes and then seeing a bunch
> of junk fly by.
> 2) Summary output is organized by test function, not by file.  This gives
> better counts.
> 3) Behavior seems different somehow.  My successes / failures / unexpected
> successes seem different when using this formatter.  I'll have to look into
> this to figure out if it's just my imagination or something's actually
> wrong.
>

Let me know if you see anything wrong.  We've been using the Xunit
formatter with our CI for 3-ish months at this point (which runs on the
test event architecture), and have been happy with the results.  Our CI
cleanup still gets timeouts and exceptional exits from the legacy summary,
but I have bugzilla bugs on routing those through the test event system as
well.  That means ResultFormatter implementations can handle them, which is
ultimately what we want.

>
> Will this abstraction of a result formatter provide some way to make a
> formatter intended for machine consumption?
>

Sure.  What did you have in mind?  Some kind of structured output?  (Right
now the Xunit formatter is exactly that - our CI uses the Xunit output and
ingests it into Jenkins, which has an Xunit-based plugin that can interpret
and display the results quite nicely).

> Right now lots of people scrape logs for the purposes of build bots, and
> it would be better if we could just dump the results as json or something.
>
> Only suggestion I have at this point is that in the summary printout the
> lines are too long.
>

Agreed.

> Long test name (module.classname.function) and long filename.  Perhaps the
> filename could be converted to a relative path
>

This was something I was planning on doing.  I like the idea.

> and the classname could be dropped (there's only one class per file
> anyway, so the classname is just wasted space)
>

Part of the reason I included that is I've hit several times where copy and
paste errors lead to the same class name, method name or even file name
being used for a test.  I think, though, that most of those are addressed
by having the path (relative is fine) to the python test file.  I think we
can probably get by with classname.methodname (relative test path).  (From
your other email, I think you nuke the classname and keep the module name,
but I'd probably do the reverse, keeping the class name and getting rid of
the module name since it can be derived from the filename).

Definitely let me know if you find any issues aside from lack of timeout
and exceptional (POSIX-land crashed process) exits.  As you noted and I
described, the counts are all now test method based.

Feel free to play with the Xunit one as well if you're looking for
something machine readable.  And if you're running a test run on Linux, the
curses-based one gives nice output with some user control over what test
results show in the lower portion of the display.  The upper portion shows
you what each of the workers is doing. --curses is the shortcut to get to
it.

-Todd

> On Wed, Dec 2, 2015 at 3:06 PM Todd Fiala <todd.fiala at gmail.com> wrote:
>
>> All good.
>>
>> Yeah Python 2/3 string/byte handling is one of the trickier areas of
>> cross-Python version compatibility.
>>
>> On Wed, Dec 2, 2015 at 2:18 PM, Zachary Turner <zturner at google.com>
>> wrote:
>>
>>>  think I got something working, I'll upload a patch in a few.
>>>
>>> On Wed, Dec 2, 2015 at 2:10 PM Zachary Turner <zturner at google.com>
>>> wrote:
>>>
>>>> BTW, this is what I do in the swig service, and I think it's the
>>>> idiomatic way of sending variable length data over a socket.
>>>>
>>>> On Wed, Dec 2, 2015 at 2:08 PM Zachary Turner <zturner at google.com>
>>>> wrote:
>>>>
>>>>> I think the best solution is going to be to use struct.pack.  Like
>>>>> this on the writing side:
>>>>>
>>>>>         import struct
>>>>>         msg = cPickle.dumps(test_event)
>>>>>         packet = struct.pack("!I%ds" % len(msg), len(msg), msg)
>>>>>         self.out_file.send(packet)
>>>>>
>>>>> and like this on the reading side:
>>>>>
>>>>>                 self.packet_bytes_remaining = struct.unpack("!I",
>>>>> self.header_contents)[0]
>>>>>                 self.header_contents = b""
>>>>>                 self.reading_header = False
>>>>>                 return data[(index+1):]
>>>>>
>>>>> The current solution still doesn't work because it's calling
>>>>> socket.send() with a str instead of a bytes.  So it has to be bytes all the
>>>>> way through.
>>>>>
>>>>> On Wed, Dec 2, 2015 at 1:48 PM Todd Fiala <todd.fiala at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Those fixes are up here:
>>>>>> r254550
>>>>>>
>>>>>> Let me know what you see after that, Zachary.
>>>>>>
>>>>>> On Wed, Dec 2, 2015 at 1:45 PM, Todd Fiala <todd.fiala at gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Dec 2, 2015 at 1:42 PM, Todd Fiala <todd.fiala at gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Can you try making those changes in the other spots?  There's a
>>>>>>>> handful of code you have probably not ever run if you haven't selected
>>>>>>>> running with a test results formatter.
>>>>>>>>
>>>>>>>>
>>>>>>> I'm actually going to make the ibuffer ones now since it's easier
>>>>>>> for me to verify that it doesn't break the non-Windows side since I'm right
>>>>>>> in there now.  If that doesn't do it, we'll need to see what else doesn't
>>>>>>> work.
>>>>>>>
>>>>>>>
>>>>>>>> If not, I can try to address them tonight or tomorrow night when I
>>>>>>>> can get to some kind of python 3 setup.
>>>>>>>>
>>>>>>>> On Wed, Dec 2, 2015 at 1:36 PM, Zachary Turner <zturner at google.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Yea I was messing around with it too.  I don't have a fix yet but
>>>>>>>>> I think you will need to either encode the pickled data as utf8, or better
>>>>>>>>> yet, don't write this:
>>>>>>>>>
>>>>>>>>> "{}#{}".format(...)
>>>>>>>>>
>>>>>>>>> because pickled data is supposed to be binary data anyway.  So use
>>>>>>>>> bytes.append() instead.
>>>>>>>>>
>>>>>>>>> Then on the other side in dotest_channels, there's a couple places
>>>>>>>>> where you do something like:
>>>>>>>>>
>>>>>>>>> self.ibuffer = ""
>>>>>>>>>
>>>>>>>>> which would need to change to
>>>>>>>>>
>>>>>>>>> self.ibuffer = b""
>>>>>>>>>
>>>>>>>>> and any other similar operations on self.ibuffer which assume
>>>>>>>>> string data.
>>>>>>>>>
>>>>>>>>> On Wed, Dec 2, 2015 at 1:33 PM Todd Fiala <todd.fiala at gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> I think I know how to fix.  Trying now.
>>>>>>>>>>
>>>>>>>>>> On Wed, Dec 2, 2015 at 1:17 PM, Todd Fiala <todd.fiala at gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> I think I can fix the issue without you debugging.
>>>>>>>>>>>
>>>>>>>>>>> Getting the single pass test runner to use it isn't impossible
>>>>>>>>>>> but will take some work.  Can you direct-send me the backtrace from the
>>>>>>>>>>> point of failure from your system?  Thanks!
>>>>>>>>>>>
>>>>>>>>>>> -Todd
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Dec 2, 2015 at 12:34 PM, Zachary Turner <
>>>>>>>>>>> zturner at google.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Is there any way to force the single process test runner to use
>>>>>>>>>>>> this same system?  I'm trying to debug the problem, but this codepath
>>>>>>>>>>>> doesn't execute in the single process test runner, and it executes in the
>>>>>>>>>>>> child process in the multiprocess test runner.  Basically I need the
>>>>>>>>>>>> following callstack to execute in the single process test runner:
>>>>>>>>>>>>
>>>>>>>>>>>> Command invoked: C:\Python35\python_d.exe
>>>>>>>>>>>> D:\src\llvm\tools\lldb\test\dotest.py -q --arch=i686 --executable
>>>>>>>>>>>> D:/src/llvmbuild/ninja_py35/bin/lldb.exe -s
>>>>>>>>>>>> D:/src/llvmbuild/ninja_py35/lldb-test-traces -u CXXFLAGS -u CFLAGS
>>>>>>>>>>>> --enable-crash-dialog -C d:\src\llvmbuild\ninja_release\bin\clang.exe
>>>>>>>>>>>> --results-port 60794 --inferior -p TestIntegerTypesExpr.py
>>>>>>>>>>>> D:\src\llvm\tools\lldb\packages\Python\lldbsuite\test --event-add-entries
>>>>>>>>>>>> worker_index=7:int
>>>>>>>>>>>> 411 out of 412 test suites processed - TestIntegerTypesExpr.py
>>>>>>>>>>>> Traceback (most recent call last):
>>>>>>>>>>>>   File "D:\src\llvm\tools\lldb\test\dotest.py", line 7, in
>>>>>>>>>>>> <module>
>>>>>>>>>>>>     lldbsuite.test.run_suite()
>>>>>>>>>>>>   File
>>>>>>>>>>>> "D:\src\llvm\tools\lldb\packages\Python\lldbsuite\test\dotest.py", line
>>>>>>>>>>>> 1476, in run_suite
>>>>>>>>>>>>     setupTestResults()
>>>>>>>>>>>>   File
>>>>>>>>>>>> "D:\src\llvm\tools\lldb\packages\Python\lldbsuite\test\dotest.py", line
>>>>>>>>>>>> 982, in setupTestResults
>>>>>>>>>>>>     results_formatter_object.handle_event(initialize_event)
>>>>>>>>>>>>   File
>>>>>>>>>>>> "D:\src\llvm\tools\lldb\packages\Python\lldbsuite\test\test_results.py",
>>>>>>>>>>>> line 1033, in handle_event
>>>>>>>>>>>>     "{}#{}".format(len(pickled_message), pickled_message))
>>>>>>>>>>>> TypeError: a bytes-like object is required, not 'str'
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Dec 2, 2015 at 11:40 AM Zachary Turner <
>>>>>>>>>>>> zturner at google.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> When I run this under Python 3 I get "A bytes object is used
>>>>>>>>>>>>> like a string" on Line 1033 of test_results.py.  I'm going to dig into it a
>>>>>>>>>>>>> little bit, but maybe you know off the top of your head the right way to
>>>>>>>>>>>>> fix it.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Dec 2, 2015 at 11:32 AM Zachary Turner <
>>>>>>>>>>>>> zturner at google.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Oh yea, I made up that decorator idea because I didn't know
>>>>>>>>>>>>>> all the formatters were derived from a common base.  But your idea is
>>>>>>>>>>>>>> better if everything is derived from a common base.  To be honest you could
>>>>>>>>>>>>>> even just generate an error if there are two ResultsFormatter derived
>>>>>>>>>>>>>> classes in the same module.  We should be encouraging more smaller files
>>>>>>>>>>>>>> with single responsibility.  One of the things I plan to do as part of some
>>>>>>>>>>>>>> cleanup in a week or two is to split up dotest, dosep, and lldbtest.py into
>>>>>>>>>>>>>> a couple different files by breaking out things like TestBase, etc into
>>>>>>>>>>>>>> separate files.  So that it's easier to keep a mental map of where
>>>>>>>>>>>>>> different code is.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, Dec 2, 2015 at 11:26 AM Todd Fiala <
>>>>>>>>>>>>>> todd.fiala at gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Wed, Dec 2, 2015 at 11:20 AM, Todd Fiala <
>>>>>>>>>>>>>>> todd.fiala at gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Yeah I'd be good with that.  I can change that as well.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> -Todd
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Wed, Dec 2, 2015 at 11:10 AM, Zachary Turner <
>>>>>>>>>>>>>>>> zturner at google.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Also another stylistic suggestion.  I've been thinking
>>>>>>>>>>>>>>>>> about how to more logically organize all the source files now that we have
>>>>>>>>>>>>>>>>> a package.  So it makes sense conceptually to group all of the different
>>>>>>>>>>>>>>>>> result formatters under a subpackage called formatters.  So right now
>>>>>>>>>>>>>>>>> you've got lldbsuite.test.basic_results_formatter.
>>>>>>>>>>>>>>>>> BasicResultsFormatter but it might make sense for this to
>>>>>>>>>>>>>>>>> be lldbsuite.test.formatters.basic.BasicResultsFormatter.  If you do things
>>>>>>>>>>>>>>>>> this way, it can actually result in a substantially shorter command line,
>>>>>>>>>>>>>>>>> because the --results-formatter option can use lldbsuite.test.formatters as
>>>>>>>>>>>>>>>>> a starting point.  So you could instead write:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> test/dotest.py --results-formatter basic
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> dotest then looks for a `basic.py` module in the
>>>>>>>>>>>>>>>>> `lldbsuite.test.formatters` package, looks for a class inside with a
>>>>>>>>>>>>>>>>> @result_formatter decorator, and instantiates that.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> This has the advantage of making the command line shorter
>>>>>>>>>>>>>>>>> *and* a more logical source file organization.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The other thing that could allow me to do is possibly
>>>>>>>>>>>>>>> short-circuit the results formatter specifier so that, if just the module
>>>>>>>>>>>>>>> is specified, and if the module only has one ResultsFormatter-derived
>>>>>>>>>>>>>>> class, I can probably rig up code that figures out the right results
>>>>>>>>>>>>>>> formatter, shortening the required discriminator to something even shorter
>>>>>>>>>>>>>>> (i.e. module.classname becomes just module.)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Wed, Dec 2, 2015 at 11:04 AM Zachary Turner <
>>>>>>>>>>>>>>>>> zturner at google.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Can --results-file=stdout be the default so that we don't
>>>>>>>>>>>>>>>>>> have to specify that?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Wed, Dec 2, 2015 at 11:02 AM Todd Fiala via lldb-dev <
>>>>>>>>>>>>>>>>>> lldb-dev at lists.llvm.org> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Also, all the text in the summary is fixed-width lined
>>>>>>>>>>>>>>>>>>> up nicely, which may not show in the commit message description if you're
>>>>>>>>>>>>>>>>>>> using a variable-width font.  On a terminal it looks nice.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Wed, Dec 2, 2015 at 11:01 AM, Todd Fiala <
>>>>>>>>>>>>>>>>>>> todd.fiala at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Wed, Dec 2, 2015 at 10:57 AM, Todd Fiala <
>>>>>>>>>>>>>>>>>>>> todd.fiala at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I just put up an optional test results formatter that
>>>>>>>>>>>>>>>>>>>>> is a prototype of what we may move towards for our default test summary
>>>>>>>>>>>>>>>>>>>>> results.  It went in here:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> r254530
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> and you can try it out with something like:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> time test/dotest.py --executable
>>>>>>>>>>>>>>>>>>>>> `pwd`/build/Debug/lldb --results-formatter
>>>>>>>>>>>>>>>>>>>>> lldbsuite.test.basic_results_formatter.BasicResultsFormatter --results-file
>>>>>>>>>>>>>>>>>>>>> st
>>>>>>>>>>>>>>>>>>>>> out
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I cut and paste my line, but more than likely for most
>>>>>>>>>>>>>>>>>>>> people you'd just want this:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> test/dotest.py --results-formatter
>>>>>>>>>>>>>>>>>>>> lldbsuite.test.basic_results_formatter.BasicResultsFormatter --results-file
>>>>>>>>>>>>>>>>>>>> stdout
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> The other stuff was specific to my setup.  That line
>>>>>>>>>>>>>>>>>>>> assumes you run from the lldb source dir root.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Let me know if this satisfies the basic needs of counts
>>>>>>>>>>>>>>>>>>>>> and whatnot.  It counts test method runs rather than all the oddball "file,
>>>>>>>>>>>>>>>>>>>>> class, etc." counts we had before.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> It prints out the Details section when there are
>>>>>>>>>>>>>>>>>>>>> details, and keeps it nice and clean when there are none.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> It also mentions a bit about test reruns up top, but
>>>>>>>>>>>>>>>>>>>>> that won't come into play until I get the multi-test-pass,
>>>>>>>>>>>>>>>>>>>>> single-worker/low-load mechanism in place, which will depend on newer rerun
>>>>>>>>>>>>>>>>>>>>> count awareness support.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> The change also cleans up places where the test event
>>>>>>>>>>>>>>>>>>>>> framework was using string codes and replaces them with symbolic constants.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Let me know what you think.  I can tweak it as needed
>>>>>>>>>>>>>>>>>>>>> to address testbot and other needs.  Once it looks reasonable, I'd like to
>>>>>>>>>>>>>>>>>>>>> move over to using it by default in the parallel test runner rather than
>>>>>>>>>>>>>>>>>>>>> the legacy support.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Thanks!
>>>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>>>> -Todd
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>>> -Todd
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>> -Todd
>>>>>>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>>>>>>> lldb-dev mailing list
>>>>>>>>>>>>>>>>>>> lldb-dev at lists.llvm.org
>>>>>>>>>>>>>>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>> -Todd
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>> -Todd
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> -Todd
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> -Todd
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> -Todd
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> -Todd
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> -Todd
>>>>>>
>>>>>
>>
>>
>> --
>> -Todd
>>
>

-- 
-Todd
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/lldb-dev/attachments/20151202/e6ebe6c5/attachment-0001.html>