[lldb-dev] [RFC] Improving protocol-level compatibility between LLDB and GDB

Thu Apr 22 00:03:03 PDT 2021

> On Apr 20, 2021, at 11:39 PM, Pavel Labath via lldb-dev <lldb-dev at lists.llvm.org> wrote:
> 
> I am very happy to see this effort and I fully encourage it.

Completely agree.  Thanks for Cc:'ing me Pavel, I hadn't seen Michał's thread.

> 
> On 20/04/2021 09:13, Michał Górny via lldb-dev wrote:
>> On Mon, 2021-04-19 at 16:29 -0700, Greg Clayton wrote:
>>>> I think the first blocker towards this project are existing
>>>> implementation bugs in LLDB. For example, the vFile implementation is
>>>> documented as using incorrect data encoding and open flags. This is not
>>>> something that can be trivially fixed without breaking compatibility
>>>> between different versions of LLDB.
>>> 
>>> We should just fix this bug in LLDB in both LLDB's logic and lldb-server IMHO. We typically distribute both "lldb" and "lldb-server" together so this shouldn't be a huge problem.
>> Hmm, I've focused on this because I recall hearing that OSX users
>> sometimes run new client against system server... but now I realized
>> this isn't relevant to LLGS ;-).  Still, I'm happy to do things
>> the right way if people feel like it's needed, or the easy way if it's
>> not.
> 
> The vFile packets are used in the "platform" mode of the connection (which, btw, is also something that gdb does not have), and that is implemented by lldb-server on all hosts (although I think apple may have some custom platform implementations as well). In any case though, changing flag values on the client will affect all servers that it communicates with, regardless of the platform.
> 
> At one point, Jason cared enough about this to add a warning about not changing these constants to the code. I'd suggest checking with him whether this is still relevant.
> 
> Or just going with your proposed solution, which sounds perfectly reasonable to me....

The main backwards compatibility issue for Apple is that lldb needs to talk to old debugservers on iOS devices, where debugserver can't be updated.  I know of three protocol bugs we have today:

vFile:open flags
vFile:pread/pwrite base   https://bugs.llvm.org/show_bug.cgi?id=47820
A packet base   https://bugs.llvm.org/show_bug.cgi?id=42471

debugserver doesn't implement vFile packets.  So for those, we only need to worry about lldb/lldb-server/lldb-platform.

lldb-platform is a freestanding platform packets stub I wrote for Darwin systems a while back.  Real smol, it doesn't link to/use any llvm/lldb code.  I never upstreamed it because it doesn't really fit in with llvm/lldb projects in any way and it's not super interesting, it is very smol and simple.  I was tired of tracking down complicated bugs and wanted easier bugs.  It implements the vFile packets; it only does the platform packets and runs debugserver for everything else.

Technically a modern lldb could need to communicate with an old lldb-platform, but it's much more of a corner case and I'm not super worried about it, we can deal with that inside Apple (that is, I can be responsible for worrying about it.)

For vFile:open and vFile:pread/pwrite, I say we just change them in lldb/lldb-server and it's up to me to change them in lldb-platform at the same time.

For the A packet, debugserver is using base 10,

    errno = 0;
    arglen = strtoul(buf, &c, 10);
    if (errno != 0 && arglen == 0) {
      return HandlePacket_ILLFORMED(__FILE__, __LINE__, p,
                                    "arglen not a number on 'A' pkt");
    }
[..]
    errno = 0;
    argnum = strtoul(buf, &c, 10);
    if (errno != 0 && argnum == 0) {
      return HandlePacket_ILLFORMED(__FILE__, __LINE__, p,
                                    "argnum not a number on 'A' pkt");
    }

as does lldb,

    packet.PutChar('A');
    for (size_t i = 0, n = argv.size(); i < n; ++i) {
      arg = argv[i];
      const int arg_len = strlen(arg);
      if (i > 0)
        packet.PutChar(',');
      packet.Printf("%i,%i,", arg_len * 2, (int)i);
      packet.PutBytesAsRawHex8(arg, arg_len);

and lldb-server,

    // Decode the decimal argument string length. This length is the number of
    // hex nibbles in the argument string value.
    const uint32_t arg_len = packet.GetU32(UINT32_MAX);
    if (arg_len == UINT32_MAX)
      success = false;
    else {
      // Make sure the argument hex string length is followed by a comma
      if (packet.GetChar() != ',')
        success = false;
      else {
        // Decode the argument index. We ignore this really because who would
        // really send down the arguments in a random order???
        const uint32_t arg_idx = packet.GetU32(UINT32_MAX);

uint32_t StringExtractor::GetU32(uint32_t fail_value, int base) {
  if (m_index < m_packet.size()) {
    char *end = nullptr;
    const char *start = m_packet.c_str();
    const char *cstr = start + m_index;
    uint32_t result = static_cast<uint32_t>(::strtoul(cstr, &end, base));

where 'base' defaults to 0, which strtoul treats as base 10 unless the number starts with 0x (parsed as hex) or with a leading 0 (parsed as octal).
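
To make the base-0 behavior concrete, here is a minimal sketch that passes a base straight through to strtoul the way the GetU32 excerpt above does.  ParseU32 is a hypothetical helper written for this illustration, not an lldb function.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>

// Hypothetical stand-in for a GetU32-style helper: parse the number at the
// start of `text`, handing `base` to strtoul unchanged.
static uint32_t ParseU32(const char *text, int base) {
  assert(text != nullptr);
  char *end = nullptr;
  return static_cast<uint32_t>(std::strtoul(text, &end, base));
}
```

With base 0 the string "16" parses as sixteen and "0x10" also parses as sixteen, but "10" parses as ten, so a hex-encoded length without a 0x prefix is silently read as decimal.  A leading zero is a further trap: "010" parses as eight.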

The A packet one is the trickiest to clean up IMO.  We have two signals that can be useful.  debugserver responds to the qGDBServerVersion packet,

(lldb) process plugin packet send qGDBServerVersion
  packet: qGDBServerVersion
response: name:debugserver;version:1205.2;

which hilariously no one else does.  This can tell us definitively that we're talking to debugserver.  And we can add a feature request to qSupported, like

send packet:  "qSupported:xmlRegisters=i386,arm,mips,arc;a-packet-base16;"
read packet:  "qXfer:features:read+;PacketSize=20000;qEcho+;a-packet-base16+"

This tells us that we're talking to a debugserver that can handle base16 numbers in A, and it will expect them.  And we can test if the remote stub is debugserver.  If it's debugserver and it did not say it supports this, then we need to send base10.
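
A rough sketch of that decision on the client side, under stated assumptions.  StubInfo, ReplyHasFeature, ChooseAPacketBase and EncodeAPacket are hypothetical names made up for this illustration, the "a-packet-base16" feature is only the proposal above, and the choice of base16 for stubs that are not debugserver is my assumption here, not something settled in this thread.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical summary of what the client learned from the remote stub.
struct StubInfo {
  bool is_debugserver;    // qGDBServerVersion replied with name:debugserver
  bool supports_a_base16; // qSupported reply listed "a-packet-base16+"
};

// True if the ';'-separated qSupported reply contains exactly `feature`.
bool ReplyHasFeature(const std::string &reply, const std::string &feature) {
  size_t start = 0;
  while (start <= reply.size()) {
    size_t end = reply.find(';', start);
    if (end == std::string::npos)
      end = reply.size();
    if (reply.compare(start, end - start, feature) == 0)
      return true;
    start = end + 1;
  }
  return false;
}

// Pick the base for arglen/argnum in the A packet.
int ChooseAPacketBase(const StubInfo &stub) {
  if (stub.supports_a_base16)
    return 16;
  if (stub.is_debugserver)
    return 10; // old debugserver that cannot be updated
  return 16;   // ASSUMPTION: any other stub gets hex numbers
}

// Build "A<arglen>,<argnum>,<hex bytes>[,...]" with the numbers in `base`.
std::string EncodeAPacket(const std::vector<std::string> &argv, int base) {
  std::string packet = "A";
  for (size_t i = 0; i < argv.size(); ++i) {
    if (i > 0)
      packet += ',';
    char header[64];
    if (base == 16)
      std::snprintf(header, sizeof(header), "%zx,%zx,", argv[i].size() * 2, i);
    else
      std::snprintf(header, sizeof(header), "%zu,%zu,", argv[i].size() * 2, i);
    packet += header;
    for (unsigned char c : argv[i]) {
      char hex[3];
      std::snprintf(hex, sizeof(hex), "%02x", c);
      packet += hex;
    }
  }
  return packet;
}
```

The two bases only diverge once an argument reaches 5 characters (10 hex nibbles) or there are 10 or more arguments, which is why short test command lines can hide the mismatch.  For the single argument "/bin/cat" the header is "16,0," in base10 and "10,0," in base16.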

> 
>>> The other main issue LLDB has when using other GDB servers is the dynamic register information is not enough for debuggers to live on unless there is some hard coded support in the debugger that can help fill in register numberings. The GDB server has its own numbers, and that is great, but in order to truly be dynamic, we need to know the compiler register number (such as the reg numbers used for .eh_frame) and the DWARF register numbers for debug info that uses register numbers (these are usually the same as the compiler register numbers, but they do sometimes differ (like x86)). LLDB also likes to know "generic" register numbers like which register is the PC (RIP for x86_64, EIP for x86, etc), SP, FP and a few more. lldb-server has extensions for this so that the dynamic register info it emits is enough for LLDB. We have added extra key/value pairs to the XML that is retrieved via "target.xml" so that it can be complete. See the function in lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.cpp:
>>> 
>>> bool ParseRegisters(XMLNode feature_node, GdbServerTargetInfo &target_info,
>>>                     GDBRemoteDynamicRegisterInfo &dyn_reg_info, ABISP abi_sp,
>>>                     uint32_t &reg_num_remote, uint32_t &reg_num_local);
>>> 
>>> There are many keys we added: "encoding", "format", "gcc_regnum", "ehframe_regnum", "dwarf_regnum", "generic", "value_regnums", "invalidate_regnums", "dynamic_size_dwarf_expr_bytes"
>>> 
>> Yes, this is probably going to be the hardest part.  While working
>> on plugins, I've found LLDB register implementation very hard to figure
>> out, especially that the plugins seem to be a mix of new, old and older
>> solutions to the same problem.
>> We will probably need more ground-level design changes too.  IIRC lldb
>> sends YMM registers as a whole (i.e. with duplication with XMM
>> registers) while GDB sends them split like in XSAVE.  I'm not yet sure
>> how to handle this best -- if we don't want to push the extra complexity
>> on plugins, it might make sense to decouple the packet format from
>> the data passed to plugins.
> 
> Yes, this is definitely going to be the trickiest part, and probably deserves its own RFC. However, I want to note that in the past discussions, the consensus (between Jason and me, at least) has been to move away from this "rich" register information transfer. For one, because we have this information coded into the client anyway (as people want to communicate with gdb-like stubs).

Yes, agreed.  The remote stub should tell us a register name and what register number it wants to use to refer to it.  Everything else is gravy (unnecessary).  If we want to support the g/G packets, I would like to get the offset into the g/G packet.

The rest of the register numbers -- eh_frame, dwarf, ABI argument register convenience names -- comes from the ABI.  We can use the register names to match these up - the stub says "I've got an 'r12' and I will refer to it as register number 53" and lldb looks up r12 and gets the rest of the register information from that.  It assumes we can all agree on register names, but otherwise I think it's fine.
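
A minimal sketch of that name-based matching, assuming a tiny hand-written x86_64 table.  AbiRegInfo, RegInfo, AbiTable and MergeStubRegister are hypothetical names for this illustration and are not lldb's ABI plugin interface; the four table entries use the System V x86_64 DWARF numbers.

```cpp
#include <cstdint>
#include <map>
#include <string>

// What the debugger already knows about a register from the ABI.
struct AbiRegInfo {
  uint32_t ehframe_regnum;
  uint32_t dwarf_regnum;
  std::string generic; // "pc", "sp", ... or empty if none
};

// ABI knowledge combined with the number the stub chose for the register.
struct RegInfo {
  std::string name;
  uint32_t remote_regnum;
  AbiRegInfo abi;
};

// Hypothetical, deliberately tiny x86_64 table keyed by register name.
static const std::map<std::string, AbiRegInfo> &AbiTable() {
  static const std::map<std::string, AbiRegInfo> table = {
      {"rax", {0, 0, ""}},
      {"rsp", {7, 7, "sp"}},
      {"r12", {12, 12, ""}},
      {"rip", {16, 16, "pc"}},
  };
  return table;
}

// The stub supplies only (name, remote_regnum); everything else is filled in
// from the table.  Returns false if the ABI does not know the name.
bool MergeStubRegister(const std::string &name, uint32_t remote_regnum,
                       RegInfo &out) {
  std::map<std::string, AbiRegInfo>::const_iterator it = AbiTable().find(name);
  if (it == AbiTable().end())
    return false;
  out.name = name;
  out.remote_regnum = remote_regnum;
  out.abi = it->second;
  return true;
}
```

So a stub that says "r12 is register 53" ends up with remote number 53 and DWARF number 12, while a name the table does not know is reported as a failure, and that is the point where the "we all agree on register names" assumption would bite.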

As for xmm/ymm/zmm, Greg has a scheme where we can specify registers that overlap in the target.xml returned.  This allows us to say we have register al/ah/ax/eax/rax and that they're all the same actual register, so if any one of them is modified, they're all modified.  e.g.

  <reg name="rax" regnum="0" offset="0" bitsize="64" group="general" group_id="1" ehframe_regnum="0" dwarf_regnum="0" invalidate_regnums="0,21,37,53,57"/>

  <reg name="eax" regnum="21" offset="0" bitsize="32" group="general" group_id="1" value_regnums="0" invalidate_regnums="0,21,37,53,57"/>

  <reg name="ax" regnum="37" offset="0" bitsize="16" group="general" group_id="1" value_regnums="0" invalidate_regnums="0,21,37,53,57"/>

  <reg name="ah" regnum="53" offset="1" bitsize="8" group="general" group_id="1" value_regnums="0" invalidate_regnums="0,21,37,53,57"/>

  <reg name="al" regnum="57" offset="0" bitsize="8" group="general" group_id="1" value_regnums="0" invalidate_regnums="0,21,37,53,57"/>

debugserver sends up the value of rax at a stop along with all the GPRs,

 <  19> send packet: $vCont;s:183b166#8d
 < 626> read packet: $T05thread:183b166;threads:183b166;thread-pcs:100003502;00:f034000001000000;...

and I can print any of these variants without any more packets being sent, because lldb knows it already has all of them.

(lldb) reg read al ah ax eax rax
      al = 0xf0
      ah = 0x34
      ax = 0x34f0
     eax = 0x000034f0
     rax = 0x00000001000034f0  a.out`main at b.cc:3
(lldb) 

(I had packet logging turned on here, you'll have to take my word that no packets were sent ;)
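
How the smaller registers fall out of the rax bytes can be sketched as below, assuming little-endian byte order and treating the offset attribute as a byte offset into rax (which works here because rax itself is at offset 0).  ReadSubRegister is a hypothetical helper, not lldb code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Extract a sub-register from the little-endian bytes of the register that
// contains it, given a byte offset and a size in bits (at most 64).
uint64_t ReadSubRegister(const std::vector<uint8_t> &parent_bytes,
                         size_t byte_offset, size_t bit_size) {
  const size_t byte_size = bit_size / 8;
  assert(byte_size <= 8 && byte_offset + byte_size <= parent_bytes.size());
  uint64_t value = 0;
  for (size_t i = 0; i < byte_size; ++i)
    value |= static_cast<uint64_t>(parent_bytes[byte_offset + i]) << (8 * i);
  return value;
}
```

Feeding it the eight bytes from the stop packet (f0 34 00 00 01 00 00 00) with the offsets and sizes from the XML gives 0xf0 for al, 0x34 for ah, 0x34f0 for ax and eax, and 0x1000034f0 for rax, which matches the reg read output above.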

debugserver describes xmm/ymm/zmm the same way, so when I go to read one, it gets the full register contents -

(lldb) reg read xmm0
 <  23> send packet: $p63;thread:183b166;#9c
 < 132> read packet: $ffff0000000000000000000000ff0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000#00
    xmm0 = {0xff 0xff 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0xff 0x00 0x00}

(lldb) reg read ymm0
    ymm0 = {0xff 0xff 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0xff 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00}

(lldb) reg read zmm0
    zmm0 = {0xff 0xff 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0xff 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00}
(lldb) 

So my request to read xmm0 actually fetched zmm0.
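
A small sketch of what that amounts to on the client: decode the hex payload of the p reply into the 64 bytes of zmm0, then take the low 16 bytes for xmm0 (or the low 32 for ymm0).  DecodeHexBytes and LowBytes are hypothetical helpers for this illustration, and the code assumes the narrower registers are simply the low bytes of the wider one, as the output above shows.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Value of one hex digit, or -1 if `c` is not a hex digit.
static int HexNibble(char c) {
  if (c >= '0' && c <= '9')
    return c - '0';
  if (c >= 'a' && c <= 'f')
    return c - 'a' + 10;
  if (c >= 'A' && c <= 'F')
    return c - 'A' + 10;
  return -1;
}

// Decode a packet payload of hex pairs into bytes.  Returns an empty vector
// if the payload has odd length or contains a non-hex character.
std::vector<uint8_t> DecodeHexBytes(const std::string &hex) {
  std::vector<uint8_t> bytes;
  if (hex.size() % 2 != 0)
    return bytes;
  for (size_t i = 0; i < hex.size(); i += 2) {
    const int hi = HexNibble(hex[i]);
    const int lo = HexNibble(hex[i + 1]);
    if (hi < 0 || lo < 0)
      return std::vector<uint8_t>();
    bytes.push_back(static_cast<uint8_t>((hi << 4) | lo));
  }
  return bytes;
}

// The first `count` bytes of a wider register, e.g. xmm0 out of zmm0.
std::vector<uint8_t> LowBytes(const std::vector<uint8_t> &full, size_t count) {
  if (count > full.size())
    count = full.size();
  return std::vector<uint8_t>(full.begin(), full.begin() + count);
}
```

Usage would be along the lines of LowBytes(DecodeHexBytes(payload), 16) for xmm0, where payload is the text between '$' and '#' in the reply.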

J