<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class="">This situation is somewhat complicated by the fact that Zachary - the only listed code owner for Windows support - hasn’t worked on lldb for quite a while now.  Various people have been helping with the Windows port, but I’m not sure anyone is taking overall responsibility for it.<div class=""><br class=""></div><div class="">Greg may have access to a Windows system, but neither Jason nor I work on Windows at all.  In fact, I don’t think anybody listed in the CODE_OWNERS file for lldb does much work on Windows.  For the health of that port, we probably do need someone to organize the effort and help sort out issues like this.</div><div class=""><br class=""></div><div class="">Anyway, looking at the current set of bot failures for this Windows bot, I saw three basic classes of failures (besides the build breaks).</div><div class=""><br class=""></div><div class="">1) Watchpoint Support:</div><div class=""><br class=""></div><div class="">TestWatchLocation.py wasn’t the only, or even the most common, watchpoint failure in these test runs.</div><div class=""><br class=""></div><div class="">For instance, in:<div class=""><br class=""></div><div class=""><a href="https://lab.llvm.org/buildbot/#/builders/83/builds/13600" class="">https://lab.llvm.org/buildbot/#/builders/83/builds/13600</a></div><div class=""><a href="https://lab.llvm.org/buildbot/#/builders/83/builds/13543" class="">https://lab.llvm.org/buildbot/#/builders/83/builds/13543</a></div><div class=""><br class=""></div><div class="">the failing test is TestWatchpointMultipleThreads.py.</div><div class=""><br class=""></div><div class="">On:</div><div class=""><br class=""></div><div class=""><div class=""><a href="https://lab.llvm.org/buildbot/#/builders/83/builds/13579" 
class="">https://lab.llvm.org/buildbot/#/builders/83/builds/13579</a></div></div><div class=""><div class=""><a href="https://lab.llvm.org/buildbot/#/builders/83/builds/13576" class="">https://lab.llvm.org/buildbot/#/builders/83/builds/13576</a></div></div><div class=""><a href="https://lab.llvm.org/buildbot/#/builders/83/builds/13565" class="">https://lab.llvm.org/buildbot/#/builders/83/builds/13565</a></div><div class=""><a href="https://lab.llvm.org/buildbot/#/builders/83/builds/13538" class="">https://lab.llvm.org/buildbot/#/builders/83/builds/13538</a></div><div class=""><br class=""></div><div class=""><div class="">It’s TestSetWatchlocation.py</div></div><div class=""><br class=""></div><div class="">On:</div><div class=""><a href="https://lab.llvm.org/buildbot/#/builders/83/builds/13550" class="">https://lab.llvm.org/buildbot/#/builders/83/builds/13550</a></div><div class=""><a href="https://lab.llvm.org/buildbot/#/builders/83/builds/13508" class="">https://lab.llvm.org/buildbot/#/builders/83/builds/13508</a></div><div class=""><br class=""></div><div class="">It’s TestWatchLocationWithWatchSet.py</div><div class=""><br class=""></div><div class="">On:</div><div class=""><br class=""></div><div class=""><a href="https://lab.llvm.org/buildbot/#/builders/83/builds/13528" class="">https://lab.llvm.org/buildbot/#/builders/83/builds/13528</a></div><div class=""><br class=""></div><div class="">It’s TestTargetWatchAddress.py</div><div class=""><br class=""></div><div class="">These are all failing, in one way or another, because we set a watchpoint, expected to hit it, and did not.  In the failing tests, we do verify that we got a valid watchpoint back; we just “continue” expecting to hit it and don’t.  The tests don’t seem to be doing anything suspicious that would cause inconsistent behavior, and they aren’t failing on other systems.  
It sounds more like the way lldb-server for Windows implements watchpoint setting is flaky in some way.</div><div class=""><br class=""></div><div class="">So these really are “tests correctly showing flaky behavior in the underlying code”.  We could just skip all these watchpoint tests, but we already have some 268 tests marked skipIfWindows, most with annotations that some behavior or other is flaky on Windows.  It is not great for the platform support to keep adding to that count, but if nobody is available to dig into the Windows watchpoint code, we probably need to declare watchpoint support “in a beta state” and turn off all the tests for it.  But that seems like a decision that should be made by someone with more direct responsibility for the Windows port.</div><div class=""><br class=""></div><div class="">Does our bot strategy cover how to deal with incomplete support on a particular platform?  Is the only choice really just turning off all the tests that are uncovering flaws in the underlying implementation?</div><div class=""><br class=""></div><div class="">2) Random mysterious failure:</div><div class=""><br class=""></div><div class="">I also saw one failure here:</div><div class=""><br class=""></div><div class=""><a href="https://lab.llvm.org/buildbot/#/builders/83/builds/13513" class="">https://lab.llvm.org/buildbot/#/builders/83/builds/13513</a></div><div class=""><br class=""></div><div class="">functionalities/load_after_attach/TestLoadAfterAttach.py</div><div class=""><br class=""></div><div class="">In that one, lldb sets a breakpoint, confirms that the breakpoint got a valid location, then continues and runs to completion without hitting the breakpoint.  
Again, that test is quite straightforward, and it looks like the underlying implementation, not the test, is at fault.</div><div class=""><br class=""></div><div class="">3) lldb-server for Windows test failures:</div><div class=""><br class=""></div><div class="">In these runs:</div><div class=""><br class=""></div><div class=""><a href="https://lab.llvm.org/buildbot/#/builders/83/builds/13594" class="">https://lab.llvm.org/buildbot/#/builders/83/builds/13594</a></div><div class=""><a href="https://lab.llvm.org/buildbot/#/builders/83/builds/13580" class="">https://lab.llvm.org/buildbot/#/builders/83/builds/13580</a></div><div class=""><a href="https://lab.llvm.org/buildbot/#/builders/83/builds/13550" class="">https://lab.llvm.org/buildbot/#/builders/83/builds/13550</a></div><div class=""><a href="https://lab.llvm.org/buildbot/#/builders/83/builds/13535" class="">https://lab.llvm.org/buildbot/#/builders/83/builds/13535</a></div><div class=""><a href="https://lab.llvm.org/buildbot/#/builders/83/builds/13526" class="">https://lab.llvm.org/buildbot/#/builders/83/builds/13526</a></div><div class=""><a href="https://lab.llvm.org/buildbot/#/builders/83/builds/13525" class="">https://lab.llvm.org/buildbot/#/builders/83/builds/13525</a></div><div class=""><a href="https://lab.llvm.org/buildbot/#/builders/83/builds/13511" class="">https://lab.llvm.org/buildbot/#/builders/83/builds/13511</a></div><div class=""><a href="https://lab.llvm.org/buildbot/#/builders/83/builds/13498" class="">https://lab.llvm.org/buildbot/#/builders/83/builds/13498</a></div><div class=""><br class=""></div><div class="">The failure was in the Windows lldb-server implementation here:</div><div class=""><br class=""></div><div class="">tools/lldb-server/tests/./LLDBServerTests.exe/StandardStartupTest.TestStopReplyContainsThreadPcs</div><div class=""><br class=""></div><div class="">And there were a couple more lldb-server test failures:</div><div class=""><br class=""></div><div class=""><a 
href="https://lab.llvm.org/buildbot/#/builders/83/builds/13527" class="">https://lab.llvm.org/buildbot/#/builders/83/builds/13527</a></div><div class=""><a href="https://lab.llvm.org/buildbot/#/builders/83/builds/13524" class="">https://lab.llvm.org/buildbot/#/builders/83/builds/13524</a></div><div class=""><br class=""></div><div class="">Where the failure is:</div><div class=""><br class=""></div><div class="">tools/lldb-server/TestGdbRemoteExpeditedRegisters.py</div><div class=""><br class=""></div><div class="">macOS doesn’t use lldb-server, so I am not particularly familiar with it, and didn’t look into these failures further.</div><div class=""><br class=""></div><div class="">Jim</div><div class=""><br class=""><div class=""><br class=""></div><div class=""><br class=""></div><div class=""><div><br class=""><blockquote type="cite" class=""><div class="">On Jan 10, 2022, at 3:33 PM, Philip Reames <<a href="mailto:listmail@philipreames.com" class="">listmail@philipreames.com</a>> wrote:</div><br class="Apple-interchange-newline"><div class="">
  
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" class="">
  
  <div class=""><p class="">+CC lldb code owners  <br class="">
    </p><p class="">This bot appears to have been restored to the primary
      buildmaster, but is failing something like 1 in 5 builds due to
      lldb tests which are flaky.</p><p class=""><a class="moz-txt-link-freetext" href="https://lab.llvm.org/buildbot/#/builders/83">https://lab.llvm.org/buildbot/#/builders/83</a></p><p class="">Specifically, this test is the one failing:</p>
    <pre class="log select-content"><span class="no-wrap log_o" data-linenumber-content="568"><span class="">commands/watchpoints/hello_watchlocation/TestWatchLocation.py</span></span></pre><p class="">Can someone with LLDB context please either a) address the cause
      of the flakiness or b) disable the test?<br class="">
    </p><p class="">Philip</p><p class="">p.s. Please restrict this sub-thread to the topic of stabilizing
      this bot.  Policy questions can be addressed in the other
      sub-threads to keep this vaguely understandable.  <br class="">
    </p>
    <div class="moz-cite-prefix">On 1/8/22 1:01 PM, Philip Reames via
      llvm-dev wrote:<br class="">
    </div>
    <blockquote type="cite" cite="mid:d12cd142-0113-ceb9-4e34-f0391eaacf84@philipreames.com" class="">In
      this particular example, we appear to have a bunch of flaky lldb
      tests.  I personally know absolutely nothing about lldb.  I have
      no idea whether the tests are badly designed, the system they're
      being run on isn't yet supported by lldb, or if there's some
      recent code bug introduced which causes the failure.  "Someone"
      needs to take the responsibility of figuring that out, and in the
      meantime spamming developers with unactionable failure notices
      seems undesirable.  </blockquote>
  </div>

</div></blockquote></div><br class=""></div></div></div></body></html>