<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Mon, Dec 14, 2015 at 6:58 AM, Pavel Labath <span dir="ltr"><<a href="mailto:labath@google.com" target="_blank">labath@google.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi,<br>

<br>

thanks a lot for fixing the timeout issue on such a short notice.</blockquote><div><br></div><div>Sure thing.  I didn't mind at all once I understood the context in which they were being used.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> I<br>

didn't think I'd find myself defending them, as I remember being quite<br>

upset when they went in, but they have proven useful in stabilising<br>

the build bots, </blockquote><div><br></div><div>Yes, I understand.  I am definitely not a fan of tests that sometimes give useful signals (tests that don't lock up), but sometimes lock up (and therefore don't give useful signals). But since the lock-up is catchable via the timeout, and since we can figure that out definitively, this isn't quite so bad as it seems at first blush.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">and I think it's likely you may need them as well.<br></blockquote><div><br></div><div>I am hopeful the test rerun mechanism generally handles it, but that'll only be the case if the timeout is triggered by heavy load.  Otherwise, the rerun phase will be no different.  Time will tell :-)</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

I'll try to now add a nicer way to expect timeouts so that we don't<br>

have the hack in the new runner as well. I'll add a new message, like<br>

you did for the flakey decorator.</blockquote><div><br></div><div>Feel free to mimic what I did with the flakey decorator sending a new test event that can additively apply the "'expected timeout" marking to the test method run.  In fact, if you get the decorator set up and put in a </div><div><br></div><div># TODO @tfiala add test event code here</div><div><br></div><div>I can get the rest of it in.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<br>

I'm a bit uneasy about adding another kind of a decorator though. What<br>

would you (and anyone else reading this) think about adding this<br>

behavior to the existing XFAIL decorators?<br>

This way, "timeout" would become just another way in which a test can<br>

"fail", and any test marked with an XFAIL decorator would be eligible<br>

for this treatment.<br></blockquote><div><br></div><div>I think I'd be okay with that.  I like reducing the number of ways we support failure decoration, so this seems reasonable.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<br>

We would lose the ability to individually expect "failures" and<br>

"timeouts", but I don't think that is really necessary, and I think it<br>

will be worth the extra maintainability we get from the fact of having<br>

fewer test decorators.<br>

<br></blockquote><div><br></div><div>OTOH, the piece we then lose is the ability to have an XFAIL mean "Hey this test really should fail, we haven't implemented feature XYZ (correctly or otherwise), so this better fail."  In that semantic meaning, an unexpected success would truly be an actionable signal --- either the test is now passing because the feature now works (actionable signal option A: the XFAIL should come off after verifying), or or the test is passing because it is not testing what it thought it was, and the test needs to be modified to more tightly bound the expected fail condition (actionable item option B).</div><div><br></div><div>So it eliminates the definiteness of an XFAIL ideally meaning "this really should fail," turning it into "it is permissible for this to fail."</div><div><br></div><div>All that said, our Python test suite is so far away from that ideal right now.  The highest level output of our test suite that I care about is "if tests run green, this is a good build", and if "tests run red, this is a bad build."  I don't see the timeout being rolled into XFAIL as hurting that.  It seems reasonable to roll them together at this time.  And the test output will list and count the timeouts.</div><div><br></div><div>So I'd be okay with that at this time in the sake of simplifying markup for tests.</div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

What do you think?<br>

<span class="HOEnZb"><font color="#888888"><br>

pl<br>

</font></span><div class="HOEnZb"><div class="h5"><br>

<br>

On 11 December 2015 at 17:54, Todd Fiala <<a href="mailto:todd.fiala@gmail.com">todd.fiala@gmail.com</a>> wrote:<br>

> Merging threads.<br>

><br>

>> The concept is not there to protect against timeouts, which are caused<br>

> by processes being too slow, for these we have been increasing<br>

> timeouts where necessary.<br>

><br>

> Okay, I see.  If that's the intent, then expected timeout sounds reasonable.<br>

> (My abhorrence was against the idea of using that as a replacement for<br>

> increasing a timeout that was too short under load).<br>

><br>

> I would go with your original approach (the marking as expected timeout).<br>

> We can either have that generate a new event (much like a change I'm about<br>

> to put in that has flakey tests send and event indicating that they are<br>

> eligible for rerun) or annotate the start message.  FWIW, the startTest()<br>

> call on the LLDBTestResults gets called before decorators have a chance to<br>

> execute, which is why I'm going with the 'send an enabling event' approach.<br>

> (I'll be checking that in shortly here, like when I'm done writing this<br>

> email, so you'll see what I did there).<br>

><br>

> On Fri, Dec 11, 2015 at 9:41 AM, Todd Fiala <<a href="mailto:todd.fiala@gmail.com">todd.fiala@gmail.com</a>> wrote:<br>

>><br>

>><br>

>><br>

>> On Fri, Dec 11, 2015 at 3:26 AM, Pavel Labath <<a href="mailto:labath@google.com">labath@google.com</a>> wrote:<br>

>>><br>

>>> Todd, I've had to disable the new result formatter as it was not<br>

>>> working with the expected timeout logic we have for the old one. The<br>

>>> old XTIMEOUT code is a massive hack and I will be extremely glad when<br>

>>> we get rid of it, but we can't keep our buildbot red until then, so<br>

>>> I've switched it off.<br>

>>><br>

>><br>

>> Ah, sorry my comments on the check-in precede me reading this.  Glad you<br>

>> see this as a hack :-)<br>

>><br>

>> No worries on shutting it off.  I can get the expected timeout as<br>

>> currently written working with the updated summary results.<br>

>><br>

>>><br>

>>> I am ready to start working on this, but I wanted to run this idea<br>

>>> here first. I thought we could have a test annotation like:<br>

>>> @expectedTimeout(oslist=["linux"], ...)<br>

>>><br>

>>> Then, when the child runner would encounter this annotation, it would<br>

>>> set a flag in the "test is starting" message indicating that this test<br>

>>> may time out. Then if the test really times out, the parent would know<br>

>>> about this, and it could avoid flagging the test as error.<br>

>>><br>

>><br>

>> Yes, the idea seems reasonable.  The actual implementation will end up<br>

>> being slightly different as the ResultsFormatter will receive the test start<br>

>> event (where the timeout is expected comes from), whereas the reporter of<br>

>> the timeout (the test worker) will not know anything about that data.  It<br>

>> will still generate the timeout, but then the ResultsFormatter can deal with<br>

>> transforming this into the right event when a timeout is "okay".<br>

>><br>

>>><br>

>>> Alternatively, if we want to avoid the proliferation test result<br>

>>> states, we could key this off the standard @expectedFailure<br>

>>> annotation, then a "time out" would become just another way it which a<br>

>>> test can fail, and XTIMEOUT would become XFAIL.<br>

>>><br>

>>> What do you think ?<br>

>>><br>

>><br>

>> Even though the above would work, if the issue here ultimately is that a<br>

>> larger timeout is needed, we can avoid all this by increasing the timeout.<br>

>> Probably more effective, though, is going to be running it in the follow-up,<br>

>> low-load, single worker pass, where presumably we would not hit the timeout.<br>

>> If you think that would work, I'd say:<br>

>><br>

>> (1) short term (like in the next hour or so), I get the expected timeout<br>

>> working in the summary results.<br>

>><br>

>> (2) longer term (like by end of weekend or maybe Monday at worst), we have<br>

>> the second pass test run at lower load (i.e. single worker thread), which<br>

>> should prevent these things from timing out in the first place.<br>

>><br>

>> If the analysis of the cause of the timeout is incorrect, then really<br>

>> we'll want to do your initial proposal in the earlier paragraphs, though.<br>

>><br>

>> What do you think about any of that?<br>

>><br>

>><br>

>><br>

>>><br>

>>> pl<br>

>>><br>

>>> PS: I am pretty new to this part of code, so any pointers you have<br>

>>> towards implementing this would be extremely helpful.<br>

>>><br>

>>><br>

>>><br>

>>> On 10 December 2015 at 23:20, Todd Fiala <<a href="mailto:todd.fiala@gmail.com">todd.fiala@gmail.com</a>> wrote:<br>

>>> > Checked this in as r255310.  Let me know if you find any issues with<br>

>>> > that,<br>

>>> > Tamas.<br>

>>> ><br>

>>> > You will need '-v' to enable it.  Otherwise, it will just print the<br>

>>> > method<br>

>>> > name.<br>

>>> ><br>

>>> > -Todd<br>

>>> ><br>

>>> > On Thu, Dec 10, 2015 at 2:39 PM, Todd Fiala <<a href="mailto:todd.fiala@gmail.com">todd.fiala@gmail.com</a>><br>

>>> > wrote:<br>

>>> >><br>

>>> >> Sure, I can do that.<br>

>>> >><br>

>>> >> Tamas, okay to give more detail on -v?  I'll give it a shot to see<br>

>>> >> what<br>

>>> >> else comes out if we do that.<br>

>>> >><br>

>>> >> -Todd<br>

>>> >><br>

>>> >> On Thu, Dec 10, 2015 at 12:58 PM, Zachary Turner <<a href="mailto:zturner@google.com">zturner@google.com</a>><br>

>>> >> wrote:<br>

>>> >>><br>

>>> >>><br>

>>> >>><br>

>>> >>> On Thu, Dec 10, 2015 at 12:54 PM Todd Fiala <<a href="mailto:todd.fiala@gmail.com">todd.fiala@gmail.com</a>><br>

>>> >>> wrote:<br>

>>> >>>><br>

>>> >>>> Hi Tamas,<br>

>>> >>>><br>

>>> >>>><br>

>>> >>>><br>

>>> >>>> On Thu, Dec 10, 2015 at 2:52 AM, Tamas Berghammer<br>

>>> >>>> <<a href="mailto:tberghammer@google.com">tberghammer@google.com</a>> wrote:<br>

>>> >>>>><br>

>>> >>>>> HI Todd,<br>

>>> >>>>><br>

>>> >>>>> You changed the way the test failure list is printed in a way that<br>

>>> >>>>> now<br>

>>> >>>>> we only print the name of the test function failing with the name<br>

>>> >>>>> of the<br>

>>> >>>>> test file in parenthesis. Can we add back the name of the test<br>

>>> >>>>> class to this<br>

>>> >>>>> list?<br>

>>> >>>><br>

>>> >>>><br>

>>> >>>> Sure.  I originally planned to have that in there but there was some<br>

>>> >>>> discussion about it being too much info.  I'm happy to add that<br>

>>> >>>> back.<br>

>>> >>><br>

>>> >>> Can we have it tied to verbosity level?  We have -t and -v, maybe one<br>

>>> >>> of<br>

>>> >>> those could trigger more detail in the summary view.<br>

>>> >><br>

>>> >><br>

>>> >><br>

>>> >><br>

>>> >> --<br>

>>> >> -Todd<br>

>>> ><br>

>>> ><br>

>>> ><br>

>>> ><br>

>>> > --<br>

>>> > -Todd<br>

>><br>

>><br>

>><br>

>><br>

>> --<br>

>> -Todd<br>

><br>

><br>

><br>

><br>

> --<br>

> -Todd<br>

</div></div></blockquote></div><br><br clear="all"><div><br></div>-- <br><div class="gmail_signature"><div dir="ltr">-Todd</div></div>

</div></div>