[llvm-dev] [EXTERNAL] Re: Responsibilities of a buildbot owner

Tue Jan 11 09:45:43 PST 2022

On 11/01/2022 18:22, Philip Reames wrote:
> 
> On 1/11/22 3:32 AM, Pavel Labath wrote:
>> I am afraid I too have to say that I believe the real problem here is 
>> the lack of active developers with interest in/commitment to the windows 
>> port of lldb. While I appreciate having Stella's windows buildbot 
>> around, and it prevents windows from bitrotting completely, it would 
>> take a much more active involvement to resolve the multitude of 
>> systemic issues affecting windows support. Like, if we tried to apply 
>> the current llvm support policy guidelines to the windows (host-side, 
>> at least) support code, I don't think it would even meet the criteria 
>> for inclusion in the peripheral tier (active sub-community).
>>
>> Now for something slightly more constructive:
>>
>> While I am not familiar with the windows-specific parts of the 
>> watchpoint code, I think I can say without exaggerating that I have a 
>> *lot* of experience in fixing flaky tests. That experience tells me 
>> that flaky watchpoint tests are often/usually caused by factors 
>> outside lldb (due to watchpoints being a global, scarce hardware 
>> resource). Virtualization is particularly tricky here -- every 
>> virtualization technology that I've tried has had (at some point in 
>> time at least) a watchpoint-related bug. The problem described here 
>> sounds a lot like the issue I observed on Google Compute Engine, which 
>> could also miss some watchpoints "randomly". So, if this bot is 
>> running in any kind of a virtualized environment, the first thing I'd 
>> do is check whether the issue happens on physical hardware.
>>
>> Related to that, I also want to mention that we have the 
>> ability to skip categories of tests in lldb. All the watchpoint tests 
>> are (should be) annotated by the watchpoint category, and so you can 
>> easily skip all of them, either by hard-disabling the category for 
>> windows in the source code (if this is an lldb issue) or externally 
>> through the buildbot config (if this is due to the bot environment => 
>> LLDB_TEST_USER_ARGS="--skip-category watchpoint").
> 
> Would it be reasonable to recommend that all of our windows bots testing 
> lldb add this flag?  Or maybe even check something in so that all builds 
> default to not running these tests on Windows? The former would make 
> sense if we primarily think this is virtualization related, the latter if 
> we think it's more likely a code problem.
> 

If that question was meant for me, then my answer is yes. I think those 
tests should be disabled regardless of the cause. I actually tried to 
say the same thing, but I may not have succeeded in getting it across. 
Stella, can you share what kind of environment that bot is running in?
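
To make the two options concrete, here is a rough sketch of the logic, not actual lldb or buildbot code. The helper names (categories_to_skip, to_user_args) are made up for illustration; the only pieces taken from this thread are the "watchpoint" category name, the --skip-category flag, and the LLDB_TEST_USER_ARGS setting.

```python
import sys


def categories_to_skip(platform, already_skipped=()):
    """Hypothetical helper: decide which test categories to skip.

    Mirrors the "hard-disable on Windows" idea: if the host platform
    looks like Windows, add the "watchpoint" category to the skip list.
    """
    skip = list(already_skipped)
    if platform.startswith("win") and "watchpoint" not in skip:
        skip.append("watchpoint")
    return skip


def to_user_args(skip):
    """Hypothetical helper: render the skip list as test runner flags.

    The resulting string is what a bot owner would place in
    LLDB_TEST_USER_ARGS when disabling the category externally.
    """
    return " ".join("--skip-category " + category for category in skip)


if __name__ == "__main__":
    # Prints the flags for the current host; empty on non-Windows hosts.
    print(to_user_args(categories_to_skip(sys.platform)))
```

The external route keeps the decision in the bot configuration, which fits the case where the bot environment is at fault; the in-source route would apply to every Windows run, which fits the case where the problem is in lldb itself.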

> I noticed last night that we have a couple of other windows bots which 
> seem to be hitting the same false positives.  Much lower frequencies, 
> but it does seem this is not specific to the particular bot.
Hmm.. do you have a link to those bots or something? Stella's bot is the 
only windows (lldb) bot I am aware of and I'd be surprised if there were 
more of them.