[llvm-dev] [EXTERNAL] Responsibilities of a buildbot owner

Fri Jan 14 15:55:53 PST 2022

> On Jan 14, 2022, at 3:27 PM, Stella Stamenova <stilis at microsoft.com> wrote:
> 
> Thanks Omair!
>  
> I’ll wait for your change to go in and we can evaluate what else might need to happen afterwards.
>  
> I’ve been running some local tests with `LLDB_USE_LLDB_SERVER` set to 1 and that appears to have made them more stable locally. I think we should consider defaulting to using lldb-server on Windows instead of the other way around. @Greg Clayton <mailto:clayborg at gmail.com> do you happen to know why it defaults to not using lldb-server?

I do not, but the golden path we really want people to follow is to use lldb-server to debug things. This allows remote debugging to work well in all cases instead of being an avenue that no one tests.

Benefits of using lldb-server:
- Mac and Linux have been using it since the beginning, and ProcessGDBRemote is the best-supported process plug-in, as it has seen many different GDB remote clients and served multiple architectures really well
- We can get a packet log for tests to see what actually went wrong. When using ProcessWindows, unless we have logging on every API call and event that is generated, we have no hope of figuring any issues out. Anyone can enable a log with “log enable -f /tmp/packets.txt gdb-remote packets” and send that to someone to help figure out issues
- Dynamic register information is transferred and allows the logs to be even more useful since we know all of the registers from the register context detection packets
- Makes remote debugging possible and it works really well.

So I would strongly suggest switching over to lldb-server permanently if possible, and I would like to see the ProcessWindows class go away in the future. The main reason is that we will be able to see what is going on by checking the lldb-server logs when we have a flaky test. I would be happy to help figure out issues on Windows if I can see the packet logs for a flaky test, with one log from a passing run and one from a failing run. I am quite good at looking at these logs and figuring out what is going wrong. With ProcessWindows and absolutely no logging, we have no hope of figuring out any buildbot issue unless we can reliably reproduce it. Also, we have a TON of testing on lldb-server debugging, since 99% of all LLDB users use it (either lldb-server or debugserver for Darwin (macOS, iOS, tvOS, watchOS)).

So a big vote to enable this and, if all goes well, remove the ProcessWindows class and always use lldb-server from here on out.
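
To make the "default to lldb-server" idea concrete, here is a minimal sketch of what an opt-out version of the environment check could look like. This is only an assumption about how the default might be flipped, not the actual LLDB change: the function names and the set of "off" spellings are hypothetical, and it uses only the standard library instead of llvm::StringRef so that it compiles on its own.

```cpp
#include <algorithm>
#include <cctype>
#include <cstdlib>
#include <string>

// Hypothetical helper: lower-case an ASCII string so comparisons are
// case-insensitive, similar in spirit to StringRef::equals_insensitive.
inline std::string ToLowerAscii(std::string s) {
  std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
    return static_cast<char>(std::tolower(c));
  });
  return s;
}

// Hypothetical opt-out policy: use lldb-server unless the variable is
// explicitly set to a "false" spelling. `value` may be null, which means
// the variable is not set at all.
inline bool ShouldUseLLDBServerOptOut(const char *value) {
  if (value == nullptr)
    return true; // unset: default to lldb-server
  const std::string v = ToLowerAscii(value);
  return !(v == "off" || v == "no" || v == "0" || v == "false");
}

// Convenience wrapper that reads the real environment variable.
inline bool ShouldUseLLDBServerFromEnv() {
  return ShouldUseLLDBServerOptOut(std::getenv("LLDB_USE_LLDB_SERVER"));
}
```

With a check like this, ProcessWindows would only be registered when someone explicitly sets LLDB_USE_LLDB_SERVER to "off", "no", "0", or "false"; every other case, including an unset variable, would go through lldb-server.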
>  
> Thanks,
> -Stella
>  
> From: Omair Javaid <omair.javaid at linaro.org> 
> Sent: Friday, January 14, 2022 3:12 PM
> To: Stella Stamenova <stilis at microsoft.com>
> Cc: Pavel Labath <pavel at labath.sk>; Galina Kistanova <gkistanova at gmail.com>; Jonas Devlieghere <jonas at devlieghere.com>; Jim Ingham <jingham at apple.com>; llvm-dev <llvm-dev at lists.llvm.org>
> Subject: [EXTERNAL] Re: [llvm-dev] Responsibilities of a buildbot owner
>  
> Hi Stella,
>  
> This is in reference to my email on lldb-dev about setting up an LLDB Windows on Arm64 buildbot. We are currently working on setting up an Arm64 bot that will run only unit tests and shell tests. However, in the future we are going to take up LLDB on Windows Arm64 maintenance and hope to run a full-featured test suite on our buildbots. Meanwhile, as Python API support is a very important LLDB feature, not running API tests will result in a growing pile of Windows-specific failures, which will increase the engineering effort required to stabilise LLDB on Windows. I have suggested reducing the number of parallel API tests on Windows to see if it reduces the amount of noise generated by flaky tests.
>  
> https://reviews.llvm.org/D117363
>  
> In case it doesn't work, I'll take up ownership of the Windows x64 buildbot as well and try to keep the noise down, similar to what I do for the Linux Arm/Arm64 LLDB bots.
>  
> Thanks!
>  
> Omair Javaid
> www.linaro.org
>  
> On Fri, 14 Jan 2022 at 09:38, Stella Stamenova via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> I had a chat with Jonas earlier today and one of the things that came out was that we actually have three separate suites of tests in lldb:
>         - shell
>         - unit
>         - api
> 
> The category that causes the most pain in general, including on the Windows lldb bot, is the API tests. The shell tests are very stable and so are all (but one) of the unit tests.
> 
> Since, as Pavel pointed out, there's not a very active community for lldb on Windows, one thing we could do is run only the shell and unit test suites on the Windows buildbot and drop the API tests. This would allow us to prevent complete bit rot by providing relatively good coverage while at the same time removing the most unstable tests from the buildbot. Then we could dispense with having to disable individual API tests when they show instability on Windows.
> 
> I drafted a patch that would do that (with the assumption that everyone would be on board):
> https://reviews.llvm.org/D117267
> 
> Let me know if you disagree with this course of action or have any other concerns.
> 
> Thanks,
> -Stella
> 
> -----Original Message-----
> From: llvm-dev <llvm-dev-bounces at lists.llvm.org> On Behalf Of Stella Stamenova via llvm-dev
> Sent: Wednesday, January 12, 2022 9:07 AM
> To: Greg Clayton <clayborg at gmail.com>; Pavel Labath <pavel at labath.sk>
> Cc: Jim Ingham <jingham at apple.com>; llvm-dev <llvm-dev at lists.llvm.org>
> Subject: Re: [llvm-dev] [EXTERNAL] Re: Responsibilities of a buildbot owner
> 
> >  Can someone verify if we are testing with ProcessWindows or lldb-server on the build bot?
> 
> Since I didn't set LLDB_USE_LLDB_SERVER on the buildbot itself and this is not in the zorg configuration, the buildbot is using ProcessWindows.
> 
> I've never tried setting LLDB_USE_LLDB_SERVER to on when running the tests, so I am not sure what to expect from the results though. If I have time, I'll try it out locally this week to see what happens.
> 
> -----Original Message-----
> From: Greg Clayton <clayborg at gmail.com>
> Sent: Tuesday, January 11, 2022 4:42 PM
> To: Pavel Labath <pavel at labath.sk>; Stella Stamenova <stilis at microsoft.com>
> Cc: Philip Reames <listmail at philipreames.com>; Jim Ingham <jingham at apple.com>; llvm-dev <llvm-dev at lists.llvm.org>
> Subject: Re: [llvm-dev] [EXTERNAL] Re: Responsibilities of a buildbot owner
> 
> Does windows use lldb-server by default or does it use ProcessWindows? ProcessWindows is the native process debugger, and lldb-server is the way we want debugging to work. If we look at ProcessWindows.cpp:
> 
> static bool ShouldUseLLDBServer() {
>   llvm::StringRef use_lldb_server = ::getenv("LLDB_USE_LLDB_SERVER");
>   return use_lldb_server.equals_insensitive("on") ||
>          use_lldb_server.equals_insensitive("yes") ||
>          use_lldb_server.equals_insensitive("1") ||
>          use_lldb_server.equals_insensitive("true");
> }
> 
> void ProcessWindows::Initialize() {
>   if (!ShouldUseLLDBServer()) {
>     static llvm::once_flag g_once_flag;
> 
>     llvm::call_once(g_once_flag, []() {
>       PluginManager::RegisterPlugin(GetPluginNameStatic(),
>                                     GetPluginDescriptionStatic(),
>                                     CreateInstance);
>     });
>   }
> }
> 
> 
> 
> We can see that lldb-server is used if LLDB_USE_LLDB_SERVER is set to "on", "yes", "1", or "true". If it is not set, then LLDB uses the built-in ProcessWindows.cpp native process plug-in, which I believe was never fully fleshed out and had issues.
> 
> Can someone verify if we are testing with ProcessWindows or lldb-server on the build bot?
> 
> 
> > On Jan 11, 2022, at 10:31 AM, Pavel Labath via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> > 
> > On 11/01/2022 18:59, Stella Stamenova wrote:
> >> The windows lldb bot is running on a Hyper-V virtual machine, so it would make sense that if watchpoints don't work correctly in virtual environments they would be failing there. On the rare occasion I've had to run these tests locally, I have also seen them fail though, so that's not the only source of issues.
> >> Since I disabled the couple of tests yesterday, there's only one watchpoint test that is still failing randomly. One option would be to disable just this test and let the remaining few watchpoint tests continue to run on Windows (I prefer this option since some tests would continue to run). Alternatively, all the watchpoint tests can be skipped via the category flag, but in that case, I'd like us to undo the individual skips.
> > 
> > For better or worse, you're currently the most (only?) interested person in keeping windows host support working, so I think you can manage the windows skips/fails in any way you see fit. The rest of us are mostly interested in having green builds. :)
> > 
> > Hyper-V is _not_ among the virtualization systems I've tried using with lldb, so I cannot conclusively say anything about it (though I still have my doubts).
> > 
> >> I did notice while going through the watchpoint tests to see what is still enabled on Windows, that the same watchpoint tests that are disabled/failing on Windows are disabled on multiple other platforms as well. The tests passing on Windows are also the ones that are not disabled on other platforms. A third option would be to add a separate category for the watchpoint tests that don't run correctly everywhere and use that to disable them instead. This would be a more generic way to disable the tests instead of adding multiple `skipIf` statements to each test.
> > 
> > On non-x86 architectures, watchpoints tend to be available only on special (developer) hardware or similar (x86 is the outlier in having universal support), which is why these tests tend to accumulate various annotations. However, I don't think we need to solve this problem (how to skip the tests "nicely") here...
> > 
> > pl
> > _______________________________________________
> > LLVM Developers mailing list
> > llvm-dev at lists.llvm.org
> > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> 