[llvm-dev] [EXTERNAL] Re: Responsibilities of a buildbot owner

Tue Jan 11 09:59:29 PST 2022

The Windows lldb bot is running on a Hyper-V virtual machine, so it would make sense that if watchpoints don't work correctly in virtual environments, they would be failing there. On the rare occasions I've had to run these tests locally, I have also seen them fail, though, so virtualization is not the only source of issues.

Since I disabled a couple of tests yesterday, only one watchpoint test is still failing randomly. One option would be to disable just this test and let the remaining few watchpoint tests continue to run on Windows (I prefer this option since some tests would continue to run). Alternatively, all the watchpoint tests can be skipped via the category flag, but in that case, I'd like us to undo the individual skips.

While going through the watchpoint tests to see what is still enabled on Windows, I did notice that the same watchpoint tests that are disabled/failing on Windows are disabled on multiple other platforms as well. The tests passing on Windows are also the ones that are not disabled on other platforms. A third option would be to add a separate category for the watchpoint tests that don't run correctly everywhere and use that to disable them instead. This would be a more generic way to disable the tests than adding multiple `skipIf` statements to each test.
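
As a rough illustration of that third option, the sketch below uses plain Python `unittest` rather than lldb's own test decorators. The `categories` decorator, the `flaky-watchpoint` category name, and the `SKIPPED_CATEGORIES` set are all hypothetical names invented for this example; the actual lldb test suite has its own category mechanism, which may look quite different.

```python
import unittest

# Hypothetical: categories that a bot's configuration asks to skip.
SKIPPED_CATEGORIES = {"flaky-watchpoint"}


def categories(*names):
    """Hypothetical decorator: tag a test, and skip it if any tag is disabled."""
    def decorate(func):
        func.categories = frozenset(names)
        disabled = func.categories & SKIPPED_CATEGORIES
        if disabled:
            reason = "skipped categories: " + ", ".join(sorted(disabled))
            return unittest.skip(reason)(func)
        return func
    return decorate


class WatchpointTests(unittest.TestCase):
    @categories("watchpoint")
    def test_reliable_everywhere(self):
        # Stands in for a watchpoint test that passes on every platform.
        self.assertTrue(True)

    @categories("watchpoint", "flaky-watchpoint")
    def test_unreliable_on_some_platforms(self):
        # Stands in for a test that currently carries several per-platform skips.
        self.fail("would fail intermittently")


def run_suite():
    """Run the example tests and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(WatchpointTests)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

With this shape, a single extra category on the unreliable tests replaces the per-platform skip conditions, and a bot chooses which categories to disable in one place.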

Thanks,
-Stella

-----Original Message-----
From: Pavel Labath <pavel at labath.sk> 
Sent: Tuesday, January 11, 2022 9:46 AM
To: Philip Reames <listmail at philipreames.com>; Stella Stamenova <stilis at microsoft.com>; Jim Ingham <jingham at apple.com>
Cc: llvm-dev <llvm-dev at lists.llvm.org>; zturner at google.com
Subject: Re: [llvm-dev] [EXTERNAL] Re: Responsibilities of a buildbot owner

On 11/01/2022 18:22, Philip Reames wrote:
> 
> On 1/11/22 3:32 AM, Pavel Labath wrote:
>> I am afraid I too have to say that I believe the real problem here is 
>> the lack of active developers with interest in/commitment to the windows 
>> port of lldb. While I appreciate having Stella's windows buildbot 
>> around, and it prevents windows from bitrotting completely, it would 
>> take a much more active involvement to resolve the multitude of 
>> systemic issues affecting windows support. Like, if we tried to apply 
>> the current llvm support policy guidelines to the windows (host-side, 
>> at least) support code, I don't think it would even meet the criteria 
>> for inclusion in the peripheral tier (active sub-community).
>>
>> Now for something slightly more constructive:
>>
>> While I am not familiar with the windows-specific parts of the 
>> watchpoint code, I think I can say without exaggerating that I have a
>> *lot* of experience in fixing flaky tests. That experience tells me 
>> that flaky watchpoint tests are often/usually caused by factors 
>> outside lldb (due to watchpoints being a global, scarce, hardware 
>> resource). Virtualization is particularly tricky here -- every 
>> virtualization technology that I've tried has had (at some point in 
>> time at least) a watchpoint-related bug. The problem described here 
>> sounds a lot like the issue I observed on Google Compute Engine, 
>> which could also miss some watchpoints "randomly". So, if this bot is 
>> running in any kind of a virtualized environment, the first thing I'd 
>> do is check whether the issue happens on physical hardware.
>>
>> Related to that, I also want to mention that we have the 
>> ability to skip categories of tests in lldb. All the watchpoint tests 
>> are (should be) annotated by the watchpoint category, and so you can 
>> easily skip all of them, either by hard-disabling the category for 
>> windows in the source code (if this is an lldb issue) or externally 
>> through the buildbot config (if this is due to the bot environment => 
>> LLDB_TEST_USER_ARGS="--skip-category watchpoint").
> 
> Would it be reasonable to recommend that all of our windows bots 
> testing lldb add this flag?  Or maybe even check something in so that 
> all builds default to not running these tests on Windows? The former 
> would make sense if we primarily think this is virtualization related, 
> the latter if we think it's more likely a code problem.
> 

If that question was meant for me, then my answer is yes. I think those tests should be disabled regardless of the cause. I actually tried to say the same thing, but I may not have succeeded in getting it across. 
Stella, can you share what kind of environment that bot is running in?

> I noticed last night that we have a couple of other windows bots which 
> seem to be hitting the same false positives.  Much lower frequencies, 
> but it does seem this is not specific to the particular bot.
Hmm.. do you have a link to those bots or something? Stella's bot is the only windows (lldb) bot I am aware of and I'd be surprised if there were more of them.