[zorg] r174138 - LitTestCommand: Fix some log parsing bugs.

David Blaikie dblaikie at gmail.com
Fri Feb 1 09:24:47 PST 2013


On Fri, Feb 1, 2013 at 9:08 AM, Daniel Dunbar <daniel at zuster.org> wrote:
> On Fri, Feb 1, 2013 at 8:36 AM, David Blaikie <dblaikie at gmail.com> wrote:
>>
>> On Thu, Jan 31, 2013 at 5:39 PM, Daniel Dunbar <daniel at zuster.org> wrote:
>> > Author: ddunbar
>> > Date: Thu Jan 31 19:39:53 2013
>> > New Revision: 174138
>> >
>> > URL: http://llvm.org/viewvc/llvm-project?rev=174138&view=rev
>> > Log:
>> > LitTestCommand: Fix some log parsing bugs.
>> >
>> >  - The last test failure would not get recorded properly.
>> >
>> >  - The log names were wrong sometimes because of a variable shadowing bug.
>> >
>> >  - Also, always parse verbose log info even if it isn't associated with a
>> >    failing code.
>>
>> Where/when/why does this happen?
>
>
> It happens with LNT logs sometimes as in
> REGRESSED: foo
> *** ...
> no useful information here
> ***

Why does LNT do that? That seems wholly unhelpful. (& it seems like
the only thing it logs context for - why not just remove that
functionality from LNT entirely? I can't imagine that'd be difficult
to do & it's not adding any value to the logs)
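
For anyone following along, here's roughly how those marker lines line
up with the regexes added in the patch quoted below (a standalone
sketch; the 'foo' marker line is my guess at what the elided "*** ..."
line stands for):

import re

# Marker regexes as added in r174138 (copied from the patch quoted below).
kTestVerboseLogStartRE = re.compile(r"""\*{4,80} TEST '(.*)' .*""")
kTestVerboseLogStopRE = re.compile(r"""\*{10,80}""")

sample = [
    "REGRESSED: foo",
    "**** TEST 'foo' ****",        # start marker: 4+ '*'s, captures the name
    "no useful information here",  # accumulated as the verbose log body
    "**********",                  # stop marker: a bare run of 10+ '*'s
]

for line in sample:
    m = kTestVerboseLogStartRE.match(line)
    if m:
        print("verbose log starts for %r" % m.group(1))
    elif kTestVerboseLogStopRE.match(line):
        print("verbose log ends")

i.e. the start marker carries the test name and the stop marker is a
bare run of stars, so the parser can bracket the block even when the
code (REGRESSED here) isn't one we attach logs for.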

>
>>
>> & what does your change cause us to
>> do with such logs? Log them or just parse & discard them?
>
>
> Yes.
>
>>
>> (the latter
>> seems to lack fidelity with the original log - seems we should
>> probably log them in the same way)
>
>
> We could, it's easier to revisit this in the future. The main point of this
> commit was to fix the other parsing bugs.
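
To make the log-vs-discard question concrete: my reading of the patch
is that a verbose block attached to a non-failing code gets consumed
but never attached to the step. Roughly, as an extra case in the
observer tests below (the test name and the PASS input are mine, not
something in the patch):

  def test_passing_verbose_log_is_discarded(self):
    obs = self.parse_log("""
PASS: test-one (1 of 1)
**** TEST 'test-one' ****
verbose output for a passing test
**********
""")
    # The block is parsed, so it can't confuse later lines...
    self.assertEqual(obs.resultCounts, { 'PASS' : 1 })
    # ...but nothing is attached to the step, so the detail is dropped.
    self.assertEqual(obs.step.logs, [])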
>
>
>>
>>
>> >
>> > Modified:
>> >     zorg/trunk/zorg/buildbot/commands/LitTestCommand.py
>> >
>> > Modified: zorg/trunk/zorg/buildbot/commands/LitTestCommand.py
>> > URL: http://llvm.org/viewvc/llvm-project/zorg/trunk/zorg/buildbot/commands/LitTestCommand.py?rev=174138&r1=174137&r2=174138&view=diff
>> >
>> > ==============================================================================
>> > --- zorg/trunk/zorg/buildbot/commands/LitTestCommand.py (original)
>> > +++ zorg/trunk/zorg/buildbot/commands/LitTestCommand.py Thu Jan 31 19:39:53 2013
>> > @@ -8,54 +8,93 @@ from buildbot.process.buildstep import L
>> >  from buildbot.steps.shell import Test
>> >
>> >  class LitLogObserver(LogLineObserver):
>> > -  kTestLineRE = re.compile(r'([^ ]*): (.*) \(.*.*\)')
>> > -  kTestFailureLogStartRE = re.compile(r"""\*{4,80} TEST '(.*)' .*""")
>> > -  kTestFailureLogStopRE = re.compile(r"""\*{10,80}""")
>> > +  # Regular expressions for a regular test line.
>> > +  kTestLineRE = re.compile(r'([^ ]*): (.*) \(.*\)')
>> > +
>> > +  # Regular expressions for verbose log start and stop markers. Verbose log
>> > +  # output always immediately follow a test.
>> > +  kTestVerboseLogStartRE = re.compile(r"""\*{4,80} TEST '(.*)' .*""")
>> > +  kTestVerboseLogStopRE = re.compile(r"""\*{10,80}""")
>> > +
>> > +  # These are the codes for which we will include the log output in the buildbot
>> > +  # step results.
>> >    failingCodes = set(['FAIL', 'XPASS', 'KPASS', 'UNRESOLVED'])
>> > +
>> >    def __init__(self):
>> >      LogLineObserver.__init__(self)
>> >      self.resultCounts = {}
>> > -    self.inFailure = None
>> > -    self.inFailureContext = False
>> > +
>> > +    # If non-null, a tuple of the last test code and name.
>> > +    self.lastTestResult = None
>> > +
>> > +    # If non-null, a list of lines in the current log.
>> > +    self.activeVerboseLog = None
>>
>> & if we did, this variable would presumably get rolled into the
>> previous one, since they'd have the same kind of lifetime
>>
>> >
>> >    def hadFailure(self):
>> >      for code in self.failingCodes:
>> >        if self.resultCounts.get(code):
>> >          return True
>> >
>> > +  def handleVerboseLogLine(self, line):
>> > +    # If this isn't a stop marker, just append it and continue.
>> > +    if not self.kTestVerboseLogStopRE.match(line):
>> > +      self.activeVerboseLog.append(line)
>> > +      return
>> > +
>> > +    # Otherwise, the log and test info are finished.
>> > +    self.testInfoFinished()
>> > +
>> > +  def testInfoFinished(self):
>> > +    # We have finished getting information for one test, handle it.
>> > +    code, name = self.lastTestResult
>> > +
>> > +    # If the test failed, add a log entry for it.
>> > +    if code in self.failingCodes:
>> > +      # If a verbose log was not provided, just add a one line description.
>> > +      if self.activeVerboseLog is None:
>> > +        self.activeVerboseLog = ['%s: %s' % (code, name)]
>>
>> Why not just always include the status line at the start? - I thought
>> that's what I had this set up to do once you decided to support
>> context-less failures.
>>
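
To be concrete about what I meant there - something like this (an
untested sketch of the failing-code branch, reusing the names from the
patch):

      # Always lead the log with the status line, whether or not a verbose
      # block followed the result line.
      log = ['%s: %s' % (code, name)]
      if self.activeVerboseLog is not None:
        log += self.activeVerboseLog
      self.step.addCompleteLog(name.replace('/', '__'), '\n'.join(log))

(That would mean the test-two expectation in the new test below gains a
leading 'FAIL: test-two' line, which seems like a feature to me.)
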
>> > +
>> > +      # Add the log to the build status.
>> > +      self.step.addCompleteLog(name.replace('/', '__'),
>> > +                               '\n'.join(self.activeVerboseLog))
>> > +
>> > +    # Reset the current state.
>> > +    self.lastTestResult = None
>> > +    self.activeVerboseLog = None
>> > +
>> >    def outLineReceived(self, line):
>> > -    # See if we are inside a failure log.
>> > -    if self.inFailureContext:
>> > -      self.inFailure[1].append(line)
>> > -      if self.kTestFailureLogStopRE.match(line):
>> > -        self.inFailureContext = False
>> > +    # If we are inside a verbose log, just accumulate lines until we reach the
>> > +    # stop marker.
>> > +    if self.activeVerboseLog is not None:
>> > +      self.handleVerboseLogLine(line)
>> >        return
>> >
>> > -    line = line.strip()
>> > -    if not line:
>> > +    # Check for the test verbose log start marker.
>> > +    m = self.kTestVerboseLogStartRE.match(line.strip())
>> > +    if m:
>> > +      self.activeVerboseLog = []
>> > +      if m.group(1) != self.lastTestResult[1]:
>> > +        # This is bogus, the verbose log test name doesn't match what we
>> > +        # expect. Just note it in the log but otherwise accumulate as normal.
>> > +        self.activeVerboseLog.append(
>> > +          "error: verbose log output name didn't match expected test name")
>> >        return
>> >
>> > -    # Check for test failure logs.
>> > -    m = self.kTestFailureLogStartRE.match(line)
>> > -    if m and self.inFailure:
>> > -      if m.group(1)==self.inFailure[0]:
>> > -          self.inFailure[1].append(line)
>> > -          self.inFailureContext = True
>> > -          return
>> > -      else:
>> > -          msg = 'm[0]: %r\ninFailure[0]: %r' % (m.group(0), self.inFailure[0])
>> > -          raise ValueError, msg
>> > -    # Otherwise expect a test status line.
>> > -    m = self.kTestLineRE.match(line)
>> > +    # Otherwise, if we had any previous test consider it finished.
>> > +    #
>> > +    # FIXME: This assumes that we will always have at least one line following
>> > +    # the last test result to properly record each test; we could fix this if
>> > +    # buildbot provided us a hook for when the log is done.
>> > +    if self.lastTestResult:
>> > +      self.testInfoFinished()
>> > +
>> > +    # Check for a new test status line.
>> > +    m = self.kTestLineRE.match(line.strip())
>> >      if m:
>> > -      code, name = m.groups()
>> > -      if self.inFailure and name != self.inFailure[0]:
>> > -        name,log = self.inFailure
>> > -        self.step.addCompleteLog(name.replace('/', '__'), '\n'.join(log))
>> > -        self.inFailure = None
>> > -      if code in self.failingCodes:
>> > -        self.inFailure = (name, [line])
>> > +      # Remember the last test result and update the result counts.
>> > +      self.lastTestResult = (code, name) = m.groups()
>> >        self.resultCounts[code] = self.resultCounts.get(code, 0) + 1
>> > +      return
>> >
>> >  class LitTestCommand(Test):
>> >    resultNames = {'FAIL':'unexpected failures',
>> > @@ -124,7 +163,23 @@ PASS: test-three (3 of 3)
>> >  """)
>> >
>> >      self.assertEqual(obs.resultCounts, { 'FAIL' : 1, 'PASS' : 2 })
>> > -    self.assertEqual(obs.step.logs, [('test-two', 'FAIL: test-two (2 of 3)')])
>> > +    self.assertEqual(obs.step.logs, [('test-two', 'FAIL: test-two')])
>> > +
>> > +  def test_verbose_logs(self):
>> > +    obs = self.parse_log("""
>> > +FAIL: test-one (1 of 3)
>> > +FAIL: test-two (2 of 3)
>> > +**** TEST 'test-two' ****
>> > +bla bla bla
>> > +**********
>> > +FAIL: test-three (3 of 3)
>> > +""")
>> > +
>> > +    self.assertEqual(obs.resultCounts, { 'FAIL' : 3 })
>> > +    self.assertEqual(obs.step.logs, [
>> > +        ('test-one', 'FAIL: test-one'),
>> > +        ('test-two', 'bla bla bla'),
>> > +        ('test-three', 'FAIL: test-three')])
>> >
>> >  class TestCommand(unittest.TestCase):
>> >    def parse_log(self, text):
>>
>> All in all these changes seem to have made this code much more
>> complicated for cases I don't really understand & don't think I've
>> ever seen come from lit.
>
>
> These cases match what the old code handled, my understanding of the lit
> format, and how LNT uses it. The test cases I added are exactly the things
> I want this code to be able to handle.

It seems like a strange format, then. Essentially any disparity
between what the buildbot reports & what the underlying tools report
seems like a bug/less-than-ideal feature in the underlying tools that
we should fix (it's in our power to do so, & the fact that we're
massaging/transforming the data when reporting to the buildbot
suggests we think the original data wasn't in the right format to
begin with).

Why would we log context for passing results? If we consider it not
worthwhile to report up to the build results, why would it be
worthwhile to report to users not using the buildbots?

Why do we not log context for failing results? (This appears to only
happen for LNT - does lit ever do this?) LNT clearly has that
information in other logs; it's just not surfacing it. I don't see why
we would /want/ to handle these things in the Buildbot infrastructure
(yes, I agree it might be a short-term solution - but it's not the end
state we'd be aiming for).

> While I appreciate your wanting to improve this code, the cases it handled
> are used and just breaking them without understanding why they were there is
> not cool.

Agreed - which is why I asked about what changes I was making &
requested code review (I discussed the general plan on the lab list,
asked whether people preferred pre- or post-commit review, etc.).
Sorry about the breakage. I didn't really get any /review/, just "hey,
this broke stuff" and various committed changes without much
explanation/discussion.

Thanks for adding the tests/test infrastructure.

- David


