<div dir="ltr">Btw: after patching with Siva's fix <a href="http://reviews.llvm.org/D18008">http://reviews.llvm.org/D18008</a>, the first field 'small_' is fixed; however, the second field 'ml_' still emits garbage:<div><br></div><div><div>(lldb) fr v corpus</div><div>(const string &const) corpus = error: summary string parsing error: {</div><div> store_ = {</div><div> = {</div><div> small_ = "www"</div><div> ml_ = (data_ = "��UH\x89�H�}�H\x8bE�]ÐUH\x89�H��H\x89}�H\x8bE�H\x89��~\xb4��\x90��UH\x89�SH\x83�H\x89}�H�u�H�E�H���\x9e���H\x8b\x18H\x8bE�H���O\xb4��H\x89ƿ\b", size_ = 0, capacity_ = 1441151880758558720)</div><div> }</div><div> }</div><div>}</div></div><div><br></div><div>Thanks for any info regarding how to encode this string. </div><div><br></div><div>Jeffrey</div></div><div class="gmail_extra"><br><div class="gmail_quote">On Sat, Mar 26, 2016 at 3:34 PM, Jeffrey Tan <span dir="ltr"><<a href="mailto:jeffrey.fudan@gmail.com" target="_blank">jeffrey.fudan@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">Follow-up to the previous question:<div><br></div><div>Our Python code calls json.dumps to serialize the variable evaluation result into a string and send it to the IDE via RPC; however, it fails with "UnicodeDecodeError: 'utf8' codec can't decode byte 0xc9 in position 10: invalid continuation byte" because SBValue.description seems to return a non-UTF-8 string:</div><div><br></div><div><div style="font-size:12.8px">(lldb) fr v</div><div style="font-size:12.8px"><b>error: biggrep_master_server_async 0x10b9a91a: DW_TAG_member '_M_pod_data' refers to type 0x10bb1e99 which extends beyond the bounds of 0x10b9a901</b></div><div style="font-size:12.8px"><b>error: biggrep_master_server_async 0x10b98edc: DW_TAG_member 'small_' refers to type 0x10bb1d9f which extends beyond the bounds of 0x10b98ed3</b></div><div style="font-size:12.8px"><b>error: 
biggrep_master_server_async 0x10baf034: DW_TAG_member '__size' refers to type 0x10baf04d which extends beyond the bounds of 0x10baefae</b></div><div style="font-size:12.8px">(facebook::biggrep::BigGrepMasterAsync *) this = 0x00007fd14d374fd0</div><div style="font-size:12.8px">(const string &const) corpus = error: summary string parsing error: {</div><div style="font-size:12.8px"> store_ = {</div><div style="font-size:12.8px"> = {</div><div style="font-size:12.8px"> small_ = {}</div><div style="font-size:12.8px"> <b>ml_ = (data_ = "��UH\x89�H�}�H\x8bE�]ÐUH\x89�H��H\x89}�H\x8bE�H\x89��~\xb4��\x90��UH\x89�SH\x83�H\x89}�H�u�H�E�H���\x9e���H\x8b\x18H\x8bE�H���O\xb4��H\x89ƿ\b", size_ = 0, capacity_ = 1441151880758558720)</b></div><div style="font-size:12.8px"> }</div><div style="font-size:12.8px"> }</div><div style="font-size:12.8px">}</div></div><div><br></div><div><br></div><div><div>File "/data/users/jeffreytan/fbsource/fbobjc/Tools/Nuclide/pkg/nuclide-debugger-lldb-server/scripts/chromedebugger.py", line 91, in received_message</div><div><b> response_in_json = json.dumps(response);</b></div><div> File "/usr/lib64/python2.6/json/__init__.py", line 230, in dumps</div><div> return _default_encoder.encode(obj)</div><div> File "/usr/lib64/python2.6/json/encoder.py", line 367, in encode</div><div> chunks = list(self.iterencode(o))</div><div> File "/usr/lib64/python2.6/json/encoder.py", line 309, in _iterencode</div><div> for chunk in self._iterencode_dict(o, markers):</div><div> File "/usr/lib64/python2.6/json/encoder.py", line 275, in _iterencode_dict</div><div> for chunk in self._iterencode(value, markers):</div><div> File "/usr/lib64/python2.6/json/encoder.py", line 309, in _iterencode</div><div> for chunk in self._iterencode_dict(o, markers):</div><div> File "/usr/lib64/python2.6/json/encoder.py", line 275, in _iterencode_dict</div><div> for chunk in self._iterencode(value, markers):</div><div> File "/usr/lib64/python2.6/json/encoder.py", line 306, in 
_iterencode</div><div> for chunk in self._iterencode_list(o, markers):</div><div> File "/usr/lib64/python2.6/json/encoder.py", line 204, in _iterencode_list</div><div> for chunk in self._iterencode(value, markers):</div><div> File "/usr/lib64/python2.6/json/encoder.py", line 309, in _iterencode</div><div> for chunk in self._iterencode_dict(o, markers):</div><div> File "/usr/lib64/python2.6/json/encoder.py", line 275, in _iterencode_dict</div><div> for chunk in self._iterencode(value, markers):</div><div> File "/usr/lib64/python2.6/json/encoder.py", line 309, in _iterencode</div><div> for chunk in self._iterencode_dict(o, markers):</div><div> File "/usr/lib64/python2.6/json/encoder.py", line 275, in _iterencode_dict</div><div> for chunk in self._iterencode(value, markers):</div><div> File "/usr/lib64/python2.6/json/encoder.py", line 294, in _iterencode</div><div> yield encoder(o)</div><div><b>UnicodeDecodeError: 'utf8' codec can't decode byte 0xc9 in position 10: invalid continuation byte</b></div></div><div><b><br></b></div><div>Question:</div><div>Is the non-UTF-8 string expected, or is it just garbage data caused by the <span style="font-size:12.8px">DW_TAG_member error? What is the proper way to find out the string encoding and serialize it using</span><b> json.dumps()</b>?</div><span class="HOEnZb"><font color="#888888"><div><br></div><div>Jeffrey</div></font></span></div>
</blockquote></div><br></div>
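<div>For illustration only, here is a minimal sketch of one possible workaround for the json.dumps failure quoted above: decode every byte string in the response with a lossy error handler before serializing. It does not answer what encoding the data is in, and it assumes the bytes may not be text at all. The helper name to_json_safe and the shape of the sample response are hypothetical; the only library calls used are the standard bytes.decode and json.dumps.</div>

```python
import json


def to_json_safe(obj, encoding="utf-8"):
    """Return a copy of obj in which every byte string is decoded lossily.

    Hypothetical helper: bytes that are not valid in `encoding` become
    U+FFFD instead of raising UnicodeDecodeError inside json.dumps.
    """
    if isinstance(obj, bytes):
        # Positional "replace" keeps this compatible with older Pythons.
        return obj.decode(encoding, "replace")
    if isinstance(obj, dict):
        return dict(
            (to_json_safe(key, encoding), to_json_safe(value, encoding))
            for key, value in obj.items()
        )
    if isinstance(obj, (list, tuple)):
        return [to_json_safe(item, encoding) for item in obj]
    return obj


# Made-up response containing bytes that are not valid UTF-8.
raw_description = b'ml_ = (data_ = "\xc9UH\x89", size_ = 0)'
response = {"result": [{"description": raw_description}]}

response_in_json = json.dumps(to_json_safe(response))
print(response_in_json)
```

<div>The trade-off is that the original bytes cannot be recovered from the JSON. If the IDE needs them, encoding the raw value with base64 or escaping it (for example with repr) would preserve the data at the cost of readability.</div>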