[lldb-dev] UnicodeDecodeError for serialize SBValue description

Sat Mar 26 15:34:58 PDT 2016

Follow-up for the previous question:

Our Python code calls json.dumps to serialize the variable evaluation result
into a string and send it to the IDE via RPC. However, it fails with
"UnicodeDecodeError: 'utf8' codec can't decode byte 0xc9 in position 10:
invalid continuation byte" because SBValue.description seems to return a
non-UTF-8 string:

(lldb) fr v
*error: biggrep_master_server_async 0x10b9a91a: DW_TAG_member '_M_pod_data'
refers to type 0x10bb1e99 which extends beyond the bounds of 0x10b9a901*
*error: biggrep_master_server_async 0x10b98edc: DW_TAG_member 'small_'
refers to type 0x10bb1d9f which extends beyond the bounds of 0x10b98ed3*
*error: biggrep_master_server_async 0x10baf034: DW_TAG_member '__size'
refers to type 0x10baf04d which extends beyond the bounds of 0x10baefae*
(facebook::biggrep::BigGrepMasterAsync *) this = 0x00007fd14d374fd0
(const string &const) corpus = error: summary string parsing error: {
  store_ = {
     = {
      small_ = {}
      *ml_ = (data_ =
"��UH\x89�H�}�H\x8bE�]ÐUH\x89�H��H\x89}�H\x8bE�H\x89��~\xb4��\x90��UH\x89�SH\x83�H\x89}�H�u�H�E�H���\x9e���H\x8b\x18H\x8bE�H���O\xb4��H\x89ƿ\b",
size_ = 0, capacity_ = 1441151880758558720)*
    }
  }
}

  File "/data/users/jeffreytan/fbsource/fbobjc/Tools/Nuclide/pkg/nuclide-debugger-lldb-server/scripts/chromedebugger.py", line 91, in received_message
    response_in_json = json.dumps(response);
  File "/usr/lib64/python2.6/json/__init__.py", line 230, in dumps
    return _default_encoder.encode(obj)
  File "/usr/lib64/python2.6/json/encoder.py", line 367, in encode
    chunks = list(self.iterencode(o))
  File "/usr/lib64/python2.6/json/encoder.py", line 309, in _iterencode
    for chunk in self._iterencode_dict(o, markers):
  File "/usr/lib64/python2.6/json/encoder.py", line 275, in _iterencode_dict
    for chunk in self._iterencode(value, markers):
  File "/usr/lib64/python2.6/json/encoder.py", line 309, in _iterencode
    for chunk in self._iterencode_dict(o, markers):
  File "/usr/lib64/python2.6/json/encoder.py", line 275, in _iterencode_dict
    for chunk in self._iterencode(value, markers):
  File "/usr/lib64/python2.6/json/encoder.py", line 306, in _iterencode
    for chunk in self._iterencode_list(o, markers):
  File "/usr/lib64/python2.6/json/encoder.py", line 204, in _iterencode_list
    for chunk in self._iterencode(value, markers):
  File "/usr/lib64/python2.6/json/encoder.py", line 309, in _iterencode
    for chunk in self._iterencode_dict(o, markers):
  File "/usr/lib64/python2.6/json/encoder.py", line 275, in _iterencode_dict
    for chunk in self._iterencode(value, markers):
  File "/usr/lib64/python2.6/json/encoder.py", line 309, in _iterencode
    for chunk in self._iterencode_dict(o, markers):
  File "/usr/lib64/python2.6/json/encoder.py", line 275, in _iterencode_dict
    for chunk in self._iterencode(value, markers):
  File "/usr/lib64/python2.6/json/encoder.py", line 294, in _iterencode
    yield encoder(o)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xc9 in position 10: invalid continuation byte
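
The same kind of error can be reproduced outside lldb. As far as I understand, json.dumps in Python 2 decodes byte strings as UTF-8 by default (its `encoding` parameter), so any byte sequence that is not valid UTF-8 fails in the encoder. The sketch below is only an illustration: the bytes in `raw` are made up (they are not the real SBValue.description), and it calls decode directly instead of going through json.dumps.

```python
# Hypothetical input: ten ASCII bytes, then 0xc9 followed by 'H'.
# 0xc9 opens a two-byte UTF-8 sequence, but 'H' is not a continuation byte.
raw = b"0123456789\xc9H"

try:
    raw.decode("utf-8")
    decode_error = None
except UnicodeDecodeError as exc:
    decode_error = exc

# Reports byte 0xc9 at position 10 with reason "invalid continuation byte".
print(decode_error)
```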

Question:
Is the non-UTF-8 string expected, or is it just garbage data caused by the
DW_TAG_member errors? What is the proper way to find out the string encoding
and serialize it using json.dumps()?
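
For reference, one workaround I am considering is to decode every byte string with errors="replace" before calling json.dumps. This is a minimal sketch, not something confirmed by lldb: the helper names `to_text` and `sanitize` and the sample `response` payload are hypothetical, and the replacement is lossy, so it only keeps serialization from failing. It does not tell us whether the data is garbage or what its real encoding is.

```python
import json


def to_text(value, encoding="utf-8"):
    """Decode byte strings, replacing undecodable bytes with U+FFFD."""
    if isinstance(value, bytes):
        return value.decode(encoding, "replace")
    return value


def sanitize(obj):
    """Recursively convert a dict/list payload so json.dumps accepts it."""
    if isinstance(obj, dict):
        return dict((to_text(k), sanitize(v)) for k, v in obj.items())
    if isinstance(obj, (list, tuple)):
        return [sanitize(item) for item in obj]
    return to_text(obj)


# Hypothetical payload standing in for the RPC response built from SBValue.
response = {"result": {"description": b"data_ = \xc9H\x89", "size": 0}}
response_in_json = json.dumps(sanitize(response))
print(response_in_json)
```

With this, the invalid bytes show up as U+FFFD replacement characters in the JSON sent to the IDE instead of raising an exception.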

Jeffrey