[lldb-dev] UnicodeDecodeError for serialize SBValue description

Sat Mar 26 23:58:44 PDT 2016

Btw: after patching with Siva's fix (http://reviews.llvm.org/D18008), the
first field 'small_' is fixed. However, the second field 'ml_' still emits
garbage:

(lldb) fr v corpus
(const string &const) corpus = error: summary string parsing error: {
  store_ = {
     = {
      small_ = "www"
      ml_ = (data_ =
"��UH\x89�H�}�H\x8bE�]ÐUH\x89�H��H\x89}�H\x8bE�H\x89��~\xb4��\x90��UH\x89�SH\x83�H\x89}�H�u�H�E�H���\x9e���H\x8b\x18H\x8bE�H���O\xb4��H\x89ƿ\b",
size_ = 0, capacity_ = 1441151880758558720)
    }
  }
}

Thanks for any info regarding how to encode this string.
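
For reference, here is a minimal sketch of the workaround I am considering on our side. It assumes the description reaches our script as a byte string (as it does under Python 2), and it decodes with replacement so json.dumps never sees invalid bytes. The helper name to_json_safe_text and the sample bytes are hypothetical and not part of the lldb API.

```python
import json


def to_json_safe_text(raw, encoding="utf-8"):
    """Return text that json.dumps can serialize without raising.

    Bytes that are not valid in `encoding` become U+FFFD (the Unicode
    replacement character) instead of triggering UnicodeDecodeError.
    """
    if isinstance(raw, bytes):
        # The error handler is passed positionally so the call also works
        # on old interpreters where decode() rejects keyword arguments.
        return raw.decode(encoding, "replace")
    return raw  # already text, nothing to do


# Hypothetical stand-in for a description that contains raw memory bytes.
description = b'ml_ = (data_ = "\xc9H\x89\xe5", size_ = 0)'

response = {"result": {"description": to_json_safe_text(description)}}
response_in_json = json.dumps(response)
```

This loses the original byte values, which seems acceptable for display in the IDE because the data is garbage anyway. It does not address why lldb reads the wrong memory for 'ml_' in the first place.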

Jeffrey

On Sat, Mar 26, 2016 at 3:34 PM, Jeffrey Tan <jeffrey.fudan at gmail.com>
wrote:

> Follow-up for the previous question:
>
> Our Python code calls json.dumps to serialize the variable evaluation
> result into a string and send it to the IDE via RPC. However, it fails
> with "UnicodeDecodeError: 'utf8' codec can't decode byte 0xc9 in
> position 10: invalid continuation byte" because SBValue.description seems
> to return a non-UTF-8 string:
>
> (lldb) fr v
> *error: biggrep_master_server_async 0x10b9a91a: DW_TAG_member
> '_M_pod_data' refers to type 0x10bb1e99 which extends beyond the bounds of
> 0x10b9a901*
> *error: biggrep_master_server_async 0x10b98edc: DW_TAG_member 'small_'
> refers to type 0x10bb1d9f which extends beyond the bounds of 0x10b98ed3*
> *error: biggrep_master_server_async 0x10baf034: DW_TAG_member '__size'
> refers to type 0x10baf04d which extends beyond the bounds of 0x10baefae*
> (facebook::biggrep::BigGrepMasterAsync *) this = 0x00007fd14d374fd0
> (const string &const) corpus = error: summary string parsing error: {
>   store_ = {
>      = {
>       small_ = {}
>       *ml_ = (data_ =
> "��UH\x89�H�}�H\x8bE�]ÐUH\x89�H��H\x89}�H\x8bE�H\x89��~\xb4��\x90��UH\x89�SH\x83�H\x89}�H�u�H�E�H���\x9e���H\x8b\x18H\x8bE�H���O\xb4��H\x89ƿ\b",
> size_ = 0, capacity_ = 1441151880758558720)*
>     }
>   }
> }
>
>
> File
> "/data/users/jeffreytan/fbsource/fbobjc/Tools/Nuclide/pkg/nuclide-debugger-lldb-server/scripts/chromedebugger.py",
> line 91, in received_message
> *    response_in_json = json.dumps(response);*
>   File "/usr/lib64/python2.6/json/__init__.py", line 230, in dumps
>     return _default_encoder.encode(obj)
>   File "/usr/lib64/python2.6/json/encoder.py", line 367, in encode
>     chunks = list(self.iterencode(o))
>   File "/usr/lib64/python2.6/json/encoder.py", line 309, in _iterencode
>     for chunk in self._iterencode_dict(o, markers):
>   File "/usr/lib64/python2.6/json/encoder.py", line 275, in
> _iterencode_dict
>     for chunk in self._iterencode(value, markers):
>   File "/usr/lib64/python2.6/json/encoder.py", line 309, in _iterencode
>     for chunk in self._iterencode_dict(o, markers):
>   File "/usr/lib64/python2.6/json/encoder.py", line 275, in
> _iterencode_dict
>     for chunk in self._iterencode(value, markers):
>   File "/usr/lib64/python2.6/json/encoder.py", line 306, in _iterencode
>     for chunk in self._iterencode_list(o, markers):
>   File "/usr/lib64/python2.6/json/encoder.py", line 204, in
> _iterencode_list
>     for chunk in self._iterencode(value, markers):
>   File "/usr/lib64/python2.6/json/encoder.py", line 309, in _iterencode
>     for chunk in self._iterencode_dict(o, markers):
>   File "/usr/lib64/python2.6/json/encoder.py", line 275, in
> _iterencode_dict
>     for chunk in self._iterencode(value, markers):
>   File "/usr/lib64/python2.6/json/encoder.py", line 309, in _iterencode
>     for chunk in self._iterencode_dict(o, markers):
>   File "/usr/lib64/python2.6/json/encoder.py", line 275, in
> _iterencode_dict
>     for chunk in self._iterencode(value, markers):
>   File "/usr/lib64/python2.6/json/encoder.py", line 294, in _iterencode
>     yield encoder(o)
> *UnicodeDecodeError: 'utf8' codec can't decode byte 0xc9 in position 10:
> invalid continuation byte*
>
> Question:
> Is the non-UTF-8 string expected, or is it just garbage data caused by the
> DW_TAG_member error? What is the proper way to find out the string encoding
> and serialize it using json.dumps()?
>
> Jeffrey
>
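
On the quoted question of finding out the encoding: as far as I can tell, a strict decode can only say whether the bytes are valid UTF-8, not which encoding they really are. The sketch below reports the first offending byte, which matches the kind of information in the traceback. The function name find_invalid_utf8 and the sample bytes are hypothetical.

```python
def find_invalid_utf8(raw):
    """Return (offset, byte_value, reason) for the first invalid UTF-8
    sequence in `raw`, or None if the bytes decode cleanly."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError as err:
        # Slicing (not indexing) yields a length-1 byte string on both
        # Python 2 and Python 3, so ord() works on either.
        bad_byte = ord(raw[err.start:err.start + 1])
        return err.start, bad_byte, err.reason
    return None


# 0xc9 starts a two-byte UTF-8 sequence, so the next byte must be in the
# range 0x80-0xbf. 'H' (0x48) is not, which makes the sequence invalid.
sample = b"0123456789\xc9H"
problem = find_invalid_utf8(sample)
```

With this sample, the reported offset is 10 and the byte is 0xc9, the same position and byte value as in the error message above.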