[Lldb-commits] [lldb] f807d0b - [lldb/test] Fix for flakiness in TestNSDictionarySynthetic
Vedant Kumar via lldb-commits
lldb-commits at lists.llvm.org
Mon May 11 09:53:58 PDT 2020
Author: Vedant Kumar
Date: 2020-05-11T09:53:48-07:00
New Revision: f807d0b4acdb70c5a15919f6e9b02d8b212d1088
URL: https://github.com/llvm/llvm-project/commit/f807d0b4acdb70c5a15919f6e9b02d8b212d1088
DIFF: https://github.com/llvm/llvm-project/commit/f807d0b4acdb70c5a15919f6e9b02d8b212d1088.diff
LOG: [lldb/test] Fix for flakiness in TestNSDictionarySynthetic
Summary:
TestNSDictionarySynthetic sets up an NSURL which does not initialize its
_baseURL member. When the test runs and we print out the NSURL, we print
out some garbage memory pointed-to by the _baseURL member, like:
```
_baseURL = 0x0800010020004029 @"d��qX"
```
and this can cause a python unicode decoding error like:
```
UnicodeDecodeError: 'utf8' codec can't decode byte 0xa0 in position
10309: invalid start byte
```
There's a discrepancy here because lldb's StringPrinter facility tries
to only print out "printable" sequences (see: isprint32()), whereas python
rejects the StringPrinter output as invalid utf8. For the specific error
seen above, lldb's `isprint32(0xa0) = true`, even though 0xa0 is not
really "printable" in the usual sense.
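The Python side of the error above can be reproduced in isolation. This is a minimal sketch, assuming Python 3 (where `bytes.decode` defaults to strict mode). It shows only how Python treats a lone 0xa0 byte, not how lldb comes to emit it. The variable names are illustrative and are not taken from the test suite.
```python
# A lone 0xa0 byte has the bit pattern of a UTF-8 continuation byte,
# so the strict decoder refuses to treat it as the start of a sequence.
raw = b"\xa0"

try:
    raw.decode("utf-8")  # strict error handling is the default
    strict_error = None
except UnicodeDecodeError as exc:
    strict_error = exc

# By contrast, the code point U+00A0 (no-break space) is valid text,
# but its UTF-8 encoding is two bytes, not the single byte 0xa0.
encoded_nbsp = "\u00a0".encode("utf-8")

print(strict_error)
print(encoded_nbsp)
```
So a value that looks acceptable when judged as a code point can still be rejected when it appears as a raw byte in the output stream.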
The problem is that lldb and python disagree on what exactly is
"printable". Both have dismayingly hand-rolled utf8 validation code
(cf. _Py_DecodeUTF8Ex), and I can't really tell which one is more
correct.
I tried replacing lldb's isprint32() with a call to libc's iswprint():
this satisfied python, but broke emoji printing :|.
Now, I believe that lldb (and python too) ought to just call into some
battle-tested utf library, and that we shouldn't aim for compatibility
with python's strict unicode decoding mode until then.
FWIW I ran this test under an ASanified lldb hundreds of times but
didn't turn up any other issues.
rdar://62941711
Reviewers: JDevlieghere, jingham, shafik
Subscribers: lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D79645
Added:
Modified:
lldb/test/API/lldbtest.py
Removed:
################################################################################
diff --git a/lldb/test/API/lldbtest.py b/lldb/test/API/lldbtest.py
index c6331f6a0cac..fcc13a07eded 100644
--- a/lldb/test/API/lldbtest.py
+++ b/lldb/test/API/lldbtest.py
@@ -119,9 +119,14 @@ def execute(self, test, litConfig):
litConfig.maxIndividualTestTime)
if sys.version_info.major == 2:
- # In Python 2, string objects can contain Unicode characters.
- out = out.decode('utf-8')
- err = err.decode('utf-8')
+ # In Python 2, string objects can contain Unicode characters. Use
+ # the non-strict 'replace' decoding mode. We cannot use the strict
+ # mode right now because lldb's StringPrinter facility and the
+ # Python utf8 decoder have different interpretations of which
+ # characters are "printable". This leads to Python utf8 decoding
+ # exceptions even though lldb is behaving as expected.
+ out = out.decode('utf-8', 'replace')
+ err = err.decode('utf-8', 'replace')
output = """Script:\n--\n%s\n--\nExit Code: %d\n""" % (
' '.join(cmd), exitCode)
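
The effect of the 'replace' mode used in the patch can be sketched outside the test harness. This is an illustration under assumptions: it is written for Python 3, whereas the patched branch runs only under Python 2. The helper name `decode_lenient` and the sample bytes are hypothetical.
```python
def decode_lenient(data: bytes) -> str:
    """Decode UTF-8, substituting U+FFFD for any byte that is not valid."""
    return data.decode("utf-8", "replace")


# Hypothetical test output containing one stray byte (0xa0).
sample = b"_baseURL = \xa0 ok"
decoded = decode_lenient(sample)

# The stray byte becomes the Unicode replacement character;
# the surrounding ASCII text is preserved unchanged.
print(repr(decoded))
```
The trade-off is that the invalid bytes are lost in the decoded text, which is acceptable here because the output is only used for reporting.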