[Lldb-commits] [PATCH] D11187: Add jThreadsInfo support to lldb-server

Tue Jul 14 14:16:22 PDT 2015

tberghammer added inline comments.

================
Comment at: docs/lldb-gdb-remote.txt:1555-1575
@@ +1554,23 @@
+        "registers": {
+          "0":"8000000000000000",
+          "1":"0000000000000000",
+          "2":"20fabf5fff7f0000",
+          "3":"e8f8bf5fff7f0000",
+          "4":"0100000000000000",
+          "5":"d8f8bf5fff7f0000",
+          "6":"b0f8bf5fff7f0000",
+          "7":"20f4bf5fff7f0000",
+          "8":"8000000000000000",
+          "9":"61a8db78a61500db",
+          "10":"3200000000000000",
+          "11":"4602000000000000",
+          "12":"0000000000000000",
+          "13":"0000000000000000",
+          "14":"0000000000000000",
+          "15":"0000000000000000",
+          "16":"960b000001000000",
+          "17":"0202000000000000",
+          "18":"2b00000000000000",
+          "19":"0000000000000000",
+          "20":"0000000000000000"
+        },
----------------
labath wrote:
> jasonmolenda wrote:
> > jasonmolenda wrote:
> > > tberghammer wrote:
> > > > I would prefer to use hex numbers to be consistent with the stop reply packet but that change is out of scope for this change. The missing support for hex isn't an issue here because of the quotes.
> > > I agree with Tamas, the number-base assumptions in gdb-remote numbers are a nightmare; at this point even decimal can be misconstrued as hex.  I'd rather we just make it explicitly hex -- so instead of "14":, "0xe":.
> > > 
> > > I'm not a huge fan of sending back the registers as a series of bytes in target-native-endian order like this,
> > > 
> > > +          "5":"d8f8bf5fff7f0000",
> > > 
> > > I'd push for sending this back as a native JSON number object, so  "0x5":140734799804632 but this would cause problems if we try to transfer a 16-byte (or larger) vector register value - lldb's JSON parser is only designed to handle 64-bit integer values, as are most others I would expect - so maybe we have to live with this.  I'd argue for transposing to big-endian order but when we send the expedited memory contents we need to send those in target native endian order.
> > > 
> > > tl;dr: At least let's make the register numbers base 16 and prefixed with "0x".
> > Thinking about the value part of the register set, we can allow for both types of returns.  e.g. both of these could be allowed:
> > 
> > "0xe":"d8f8bf5fff7f0000"
> > "0xe":140734799804632
> > 
> > If the value is a string, it is a series of hex bytes in target-native-endian order.  If the value is an integer, it is an integer.
> > 
> > Once these are in a StructuredData object, lldb can detect whether the value is a string or integer and decode as appropriate.
> > 
> > debugserver could return integers for 64-bits-and-less register values, strings of raw bytes for vector regs.
> > I agree with Tamas, the number-base assumptions in gdb-remote numbers are a nightmare; at this point even decimal can be misconstrued as hex. I'd rather we just make it explicitly hex -- so instead of "14":, "0xe":.
> I understand the sentiment. We could potentially implement it this way.  How do you think this should be handled from a protocol specification language standpoint? I am asking because including a specific base prefix can lead people to believe they can pass any base number they want (decimal, octal?). Or should we say the number MUST be hex and MUST have 0x prefix?
> 
> > I'd push for sending this back as a native JSON number object, so "0x5":140734799804632 but this would cause problems if we try to transfer a 16-byte (or larger) vector register value - lldb's JSON parser is only designed to handle 64-bit integer values, as are most others I would expect - so maybe we have to live with this. I'd argue for transposing to big-endian order but when we send the expedited memory contents we need to send those in target native endian order.
> I would keep the register endianness as is, precisely because of vector registers. Since they contain multiple values, the whole register does not have a specific endianness, only the individual values do. If you now go around reversing the register byte-by-byte, you will produce something very odd. And reversing the individual parts sounds very fragile.
> 
> Also, I think the memory read is a completely separate issue. I think of the memory reads as a sequence of bytes -- it doesn't have endianness in the same way that an char[] doesn't have endianness until you know what is actually stored in there.
> 
> > If the value is a string, it is a series of hex bytes in target-native-endian order. If the value is an integer, it is an integer.
> I think increasing the number of formats will only add more confusion here. Sticking with one format, whatever it may be, is a much better option in my view.
My primary opinion is we should keep the format as consistent with the stop reply packet as possible so both information can be treated just as a list of "key:value" pairs with the same list of keys. From the readers perspective it isn't trivial to understand but I prefer consistency over local readability.

I don't like the idea of sending the register values as a number because if somebody just reads the packet then a decimal number usually don't mean anything and it isn't easier to parse it in the code. For me it is quite natural that we send the register values as hex data without the "0x" prefix but probably because we do it in LLDB quite a lot.

If we are changing the format of this packet I would like to see 2 more changes.
* Instead of a list of thread infos send back an object where the keys are the thread ids (as string) and the value is the current data content (without thread id)
* Move out the "memory" section from the thread specific object to the top level scope (e.g. to the same level where we have the tids as keys if we make the other change) because that info is valid at process level. It might also help in decreasing the amount of memory we are transferring with eliminating some redundancy.

http://reviews.llvm.org/D11187