[zorg] r174138 - LitTestCommand: Fix some log parsing bugs.

Daniel Dunbar daniel at zuster.org
Fri Feb 1 10:04:46 PST 2013


On Fri, Feb 1, 2013 at 9:24 AM, David Blaikie <dblaikie at gmail.com> wrote:

> On Fri, Feb 1, 2013 at 9:08 AM, Daniel Dunbar <daniel at zuster.org> wrote:
> > On Fri, Feb 1, 2013 at 8:36 AM, David Blaikie <dblaikie at gmail.com> wrote:
> >>
> >> On Thu, Jan 31, 2013 at 5:39 PM, Daniel Dunbar <daniel at zuster.org> wrote:
> >> > Author: ddunbar
> >> > Date: Thu Jan 31 19:39:53 2013
> >> > New Revision: 174138
> >> >
> >> > URL: http://llvm.org/viewvc/llvm-project?rev=174138&view=rev
> >> > Log:
> >> > LitTestCommand: Fix some log parsing bugs.
> >> >
> >> >  - The last test failure would not get recorded properly.
> >> >
> >> >  - The log names were wrong sometimes because of a variable shadowing bug.
> >> >
> >> >  - Also, always parse verbose log info even if it isn't associated with a
> >> >    failing code.
> >>
> >> Where/when/why does this happen?
> >
> >
> > It happens with LNT logs sometimes, as in:
> > REGRESSED: foo
> > *** ...
> > no useful information here
> > ***
>
> Why does LNT do that? That seems wholly unhelpful. (& it seems like
> the only thing it logs context for - why not just remove that
> functionality from LNT entirely? I can't imagine that'd be difficult
> to do & it's not adding any value to the logs)
>

I agree; it was a proxy for providing more information, but that code hasn't
been written yet. :)


> >
> >>
> >> & what does your change cause us to
> >> do with such logs? Log them or just parse & discard them?
> >
> >
> > Yes.
> >
> >>
> >> (the latter
> >> seems to lack fidelity with the original log - seems we should
> >> probably log them in the same way)
> >
> >
> > We could, it's easier to revisit this in the future. The main point of
> > this commit was to fix the other parsing bugs.
> >
> >
> >>
> >>
> >> >
> >> > Modified:
> >> >     zorg/trunk/zorg/buildbot/commands/LitTestCommand.py
> >> >
> >> > Modified: zorg/trunk/zorg/buildbot/commands/LitTestCommand.py
> >> > URL: http://llvm.org/viewvc/llvm-project/zorg/trunk/zorg/buildbot/commands/LitTestCommand.py?rev=174138&r1=174137&r2=174138&view=diff
> >> >
> >> > ==============================================================================
> >> > --- zorg/trunk/zorg/buildbot/commands/LitTestCommand.py (original)
> >> > +++ zorg/trunk/zorg/buildbot/commands/LitTestCommand.py Thu Jan 31 19:39:53 2013
> >> > @@ -8,54 +8,93 @@ from buildbot.process.buildstep import L
> >> >  from buildbot.steps.shell import Test
> >> >
> >> >  class LitLogObserver(LogLineObserver):
> >> > -  kTestLineRE = re.compile(r'([^ ]*): (.*) \(.*.*\)')
> >> > -  kTestFailureLogStartRE = re.compile(r"""\*{4,80} TEST '(.*)' .*""")
> >> > -  kTestFailureLogStopRE = re.compile(r"""\*{10,80}""")
> >> > +  # Regular expressions for a regular test line.
> >> > +  kTestLineRE = re.compile(r'([^ ]*): (.*) \(.*\)')
> >> > +
> >> > +  # Regular expressions for verbose log start and stop markers. Verbose log
> >> > +  # output always immediately follow a test.
> >> > +  kTestVerboseLogStartRE = re.compile(r"""\*{4,80} TEST '(.*)' .*""")
> >> > +  kTestVerboseLogStopRE = re.compile(r"""\*{10,80}""")
> >> > +
> >> > +  # These are the codes for which we will include the log output in the buildbot
> >> > +  # step results.
> >> >    failingCodes = set(['FAIL', 'XPASS', 'KPASS', 'UNRESOLVED'])
> >> > +
> >> >    def __init__(self):
> >> >      LogLineObserver.__init__(self)
> >> >      self.resultCounts = {}
> >> > -    self.inFailure = None
> >> > -    self.inFailureContext = False
> >> > +
> >> > +    # If non-null, a tuple of the last test code and name.
> >> > +    self.lastTestResult = None
> >> > +
> >> > +    # If non-null, a list of lines in the current log.
> >> > +    self.activeVerboseLog = None
> >>
> >> & if we did, this variable would presumably get rolled into the
> >> previous one, since they'd have the same kind of lifetime
> >>
> >> >
> >> >    def hadFailure(self):
> >> >      for code in self.failingCodes:
> >> >        if self.resultCounts.get(code):
> >> >          return True
> >> >
> >> > +  def handleVerboseLogLine(self, line):
> >> > +    # If this isn't a stop marker, just append it and continue.
> >> > +    if not self.kTestVerboseLogStopRE.match(line):
> >> > +      self.activeVerboseLog.append(line)
> >> > +      return
> >> > +
> >> > +    # Otherwise, the log and test info are finished.
> >> > +    self.testInfoFinished()
> >> > +
> >> > +  def testInfoFinished(self):
> >> > +    # We have finished getting information for one test, handle it.
> >> > +    code, name = self.lastTestResult
> >> > +
> >> > +    # If the test failed, add a log entry for it.
> >> > +    if code in self.failingCodes:
> >> > +      # If a verbose log was not provided, just add a one line description.
> >> > +      if self.activeVerboseLog is None:
> >> > +        self.activeVerboseLog = ['%s: %s' % (code, name)]
> >>
> >> Why not just always include the status line at the start? - I thought
> >> that's what I had this setup to do once you decided to support
> >> context-less failures.
> >>
> >> > +
> >> > +      # Add the log to the build status.
> >> > +      self.step.addCompleteLog(name.replace('/', '__'),
> >> > +                               '\n'.join(self.activeVerboseLog))
> >> > +
> >> > +    # Reset the current state.
> >> > +    self.lastTestResult = None
> >> > +    self.activeVerboseLog = None
> >> > +
> >> >    def outLineReceived(self, line):
> >> > -    # See if we are inside a failure log.
> >> > -    if self.inFailureContext:
> >> > -      self.inFailure[1].append(line)
> >> > -      if self.kTestFailureLogStopRE.match(line):
> >> > -        self.inFailureContext = False
> >> > +    # If we are inside a verbose log, just accumulate lines until we reach the
> >> > +    # stop marker.
> >> > +    if self.activeVerboseLog is not None:
> >> > +      self.handleVerboseLogLine(line)
> >> >        return
> >> >
> >> > -    line = line.strip()
> >> > -    if not line:
> >> > +    # Check for the test verbose log start marker.
> >> > +    m = self.kTestVerboseLogStartRE.match(line.strip())
> >> > +    if m:
> >> > +      self.activeVerboseLog = []
> >> > +      if m.group(1) != self.lastTestResult[1]:
> >> > +        # This is bogus, the verbose log test name doesn't match what we
> >> > +        # expect. Just note it in the log but otherwise accumulate as normal.
> >> > +        self.activeVerboseLog.append(
> >> > +          "error: verbose log output name didn't match expected test name")
> >> >        return
> >> >
> >> > -    # Check for test failure logs.
> >> > -    m = self.kTestFailureLogStartRE.match(line)
> >> > -    if m and self.inFailure:
> >> > -      if m.group(1)==self.inFailure[0]:
> >> > -          self.inFailure[1].append(line)
> >> > -          self.inFailureContext = True
> >> > -          return
> >> > -      else:
> >> > -          msg = 'm[0]: %r\ninFailure[0]: %r' % (m.group(0), self.inFailure[0])
> >> > -          raise ValueError, msg
> >> > -    # Otherwise expect a test status line.
> >> > -    m = self.kTestLineRE.match(line)
> >> > +    # Otherwise, if we had any previous test consider it finished.
> >> > +    #
> >> > +    # FIXME: This assumes that we will always have at least one line following
> >> > +    # the last test result to properly record each test; we could fix this if
> >> > +    # buildbot provided us a hook for when the log is done.
> >> > +    if self.lastTestResult:
> >> > +      self.testInfoFinished()
> >> > +
> >> > +    # Check for a new test status line.
> >> > +    m = self.kTestLineRE.match(line.strip())
> >> >      if m:
> >> > -      code, name = m.groups()
> >> > -      if self.inFailure and name != self.inFailure[0]:
> >> > -        name,log = self.inFailure
> >> > -        self.step.addCompleteLog(name.replace('/', '__'), '\n'.join(log))
> >> > -        self.inFailure = None
> >> > -      if code in self.failingCodes:
> >> > -        self.inFailure = (name, [line])
> >> > +      # Remember the last test result and update the result counts.
> >> > +      self.lastTestResult = (code, name) = m.groups()
> >> >        self.resultCounts[code] = self.resultCounts.get(code, 0) + 1
> >> > +      return
> >> >
> >> >  class LitTestCommand(Test):
> >> >    resultNames = {'FAIL':'unexpected failures',
> >> > @@ -124,7 +163,23 @@ PASS: test-three (3 of 3)
> >> >  """)
> >> >
> >> >      self.assertEqual(obs.resultCounts, { 'FAIL' : 1, 'PASS' : 2 })
> >> > -    self.assertEqual(obs.step.logs, [('test-two', 'FAIL: test-two (2 of 3)')])
> >> > +    self.assertEqual(obs.step.logs, [('test-two', 'FAIL: test-two')])
> >> > +
> >> > +  def test_verbose_logs(self):
> >> > +    obs = self.parse_log("""
> >> > +FAIL: test-one (1 of 3)
> >> > +FAIL: test-two (2 of 3)
> >> > +**** TEST 'test-two' ****
> >> > +bla bla bla
> >> > +**********
> >> > +FAIL: test-three (3 of 3)
> >> > +""")
> >> > +
> >> > +    self.assertEqual(obs.resultCounts, { 'FAIL' : 3 })
> >> > +    self.assertEqual(obs.step.logs, [
> >> > +        ('test-one', 'FAIL: test-one'),
> >> > +        ('test-two', 'bla bla bla'),
> >> > +        ('test-three', 'FAIL: test-three')])
> >> >
> >> >  class TestCommand(unittest.TestCase):
> >> >    def parse_log(self, text):
> >>
> >> All in all these changes seem to have made this code much more
> >> complicated for cases I don't really understand & don't think I've
> >> ever seen come from lit.
> >
> >
> > These cases match what the old code handled and what my understanding of the
> > lit format is and how LNT uses it. The test cases I added are exactly the
> > things that I want this code to be able to handle.
>
> It seems like a strange format, then.


The format seems reasonable enough to me. I view the current format as just
a list of test codes and results, with optionally associated verbose logs.
That's it, and that is the level I want the buildbot to understand it at.
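
To make that concrete (made-up test names, not output from a real run), the
shape I have in mind is just:

PASS: some/test-one (1 of 3)
FAIL: some/test-two (2 of 3)
**** TEST 'some/test-two' ****
... whatever verbose output the harness produced ...
**********
FAIL: some/test-three (3 of 3)

i.e. a stream of result lines, where any result line may optionally be
followed immediately by a verbose log block delimited by the '****' markers.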

> Essentially any disparity
> between what the buildbot reports & what the underlying tools report
> seems like it's a bug/less-than-ideal-feature in the underlying tools
> we should fix (since it's in our power to do so & the fact that we're
> massaging/transforming the data when reporting to the buildbot seems
> to indicate that we think the original data wasn't in the right format
> to begin with).
>

> Why would we log context for passing results? If we consider it not
> worthwhile to report up to the build results, why would it be
> worthwhile to report to users not using the buildbots?
>

I'm fine switching to this; I was just keeping compatibility with what
happened before. It wouldn't add any value at the moment, but if it really
bothers you, feel free to drop that branch.
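
If we did switch to always leading with the status line, testInfoFinished
would presumably collapse to something like this (a rough, untested sketch,
not what was committed):

  def testInfoFinished(self):
    # We have finished getting information for one test, handle it.
    code, name = self.lastTestResult

    if code in self.failingCodes:
      # Always lead with the one-line status, then append any verbose output.
      lines = ['%s: %s' % (code, name)]
      if self.activeVerboseLog is not None:
        lines.extend(self.activeVerboseLog)
      self.step.addCompleteLog(name.replace('/', '__'), '\n'.join(lines))

    # Reset the current state.
    self.lastTestResult = None
    self.activeVerboseLog = None

(The expected logs in the tests would need the corresponding tweak.)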

> Why do we not log context for failing results? (this appears to only
> happen for LNT - does lit ever do this?) LNT clearly has them in other
> logs, it's just not surfacing it. I don't see why we would /want/ to
> handle these things in the Buildbot infrastructure (yes, I agree it
> might be a short term solution - but not the end state we'd be aiming
> for)
>

It's always ideal to provide context, but I don't see a reason to make the
generic format require it.

 - Daniel


> > While I appreciate your wanting to improve this code, the cases it handled
> > are used, and just breaking them without understanding why they were there
> > is not cool.
>
> Agreed - hence why I asked about what changes I was making & requested
> code review (I discussed the general plan on the lab list, asked
> whether people preferred pre or post-commit review, etc). Sorry about
> the breakage. Didn't really get any /review/, just "hey, this broke
> stuff" and various committed changes without a lot of
> explanation/discussion.
>
> Thanks for adding the tests/test infrastructure.
>
> - David
>

