[lldb-dev] Python object lifetimes affect the reliability of tests
Zachary Turner via lldb-dev
lldb-dev at lists.llvm.org
Thu Oct 15 11:33:55 PDT 2015
To add more evidence for this, here's a small repro:
import sys
print "sys.exc_info() = ", "Empty" if sys.exc_info() == (None, None, None)
else "Valid"
try:
raise Exception
except Exception, e:
print "sys.exc_info() = ", "Empty" if sys.exc_info() == (None, None,
None) else "Valid"
pass
print "sys.exc_info() = ", "Empty" if sys.exc_info() == (None, None, None)
else "Valid"
print "e = ", "Bound" if 'e' in vars() else "Unbound"
pass
For me this prints
sys.exc_info() = Empty
sys.exc_info() = Valid
sys.exc_info() = Valid
e = Bound
On Thu, Oct 15, 2015 at 11:21 AM Zachary Turner <zturner at google.com> wrote:
> We actually do already to the self.dbg.DeleteTarget(target), and that's
> the line that's failing. The reason it's failing is because the 'sc'
> reference is still alive, which is holding an mmap, which causes a
> mandatory file lock on Windows.
>
> The diagnostics went pretty deep into python internals, but I think we
> might have figured it out. I don't know if this is a bug in Python, but I
> think we'd probably need to ask Guido to be sure :)
>
> As far as we can tell, what happens is that on the exceptional codepath
> (e.g the assert fails), you walk back up the stack until you get to the
> except handler. This exception handler is in TestCase.run(). After it
> handles the exception it goes and runs teardown. However, for some reason,
> Python is still holding a strong reference to the *traceback*, even though
> we're completely out of the finally block. What this means is that if you
> call `sys.exc_info()` *even after you've exited the finally block, it still
> returns info about the previous exception that's not even being handled
> anymore. I would have expected this to be gone since there's no exception
> in-fligth anymore. So basically, Python is still holding a reference to
> the active exception, the exception holds the stack frame, the stack frame
> holds the test method, the test method has locals, one of which is a
> SymbolList, a member of which is symbol context, which has the file locked.
>
> Our best guess is that if you have something like this:
>
> def foo():
> try:
> # Do stuff
> except Exception, e:
> pass
> # Do more stuff
>
> that if the exceptional path is executed, then both e and sys.exc_info()
> are alive *while* do more stuff is happening. We've found two ways to
> fixthis:
>
> 1) Change to this:
> def foo():
> try:
> # Do stuff
> except Exception, e:
> pass
> del e
> sys.exc_clear()
> # Do more stuff
>
> 2) Put the try / except inside a function. When the function returns,
> sys.exc_info() is cleared.
>
> I like 2 better, but we're still testing some more to make sure this
> really fixes it 100% of the time.
>
> On Thu, Oct 15, 2015 at 10:25 AM Greg Clayton via lldb-dev <
> lldb-dev at lists.llvm.org> wrote:
>
>>
>> > On Oct 15, 2015, at 8:50 AM, Adrian McCarthy via lldb-dev <
>> lldb-dev at lists.llvm.org> wrote:
>> >
>> > I've tracked down a source of flakiness in tests on Windows to Python
>> object lifetimes and the SB interface, and I'm wondering how best to handle
>> it.
>> >
>> > Consider this portion of a test from TestTargetAPI:
>> >
>> > def find_functions(self, exe_name):
>> > """Exercise SBTaget.FindFunctions() API."""
>> > exe = os.path.join(os.getcwd(), exe_name)
>> >
>> > # Create a target by the debugger.
>> > target = self.dbg.CreateTarget(exe)
>> > self.assertTrue(target, VALID_TARGET)
>> > list = target.FindFunctions('c', lldb.eFunctionNameTypeAuto)
>> > self.assertTrue(list.GetSize() == 1)
>> >
>> > for sc in list:
>> > self.assertTrue(sc.GetModule().GetFileSpec().GetFilename() ==
>> exe_name)
>> > self.assertTrue(sc.GetSymbol().GetName() == 'c')
>> >
>> > The local variables go out of scope when the function exits, but the SB
>> (C++) objects they represent aren't (always) immediately destroyed. At
>> least some of these objects keep references to the executable module in the
>> shared module list, so when the test framework cleans up and calls
>> `SBDebugger::DeleteTarget`, the module isn't orphaned, so LLDB maintains an
>> open handle to the executable.
>>
>> Creating a target with:
>>
>> target = self.dbg.CreateTarget(exe)
>>
>> Will give you a SBTarget object that has a strong reference to the
>> target, but the debugger still has a copy in its target list, so the
>> SBTarget isn't designed to delete the object when the target variable goes
>> out of scope. If you want the target to be deleted, you actually have to
>> call through to the debugger with:
>>
>>
>> bool
>> SBDebugger:DeleteTarget (lldb::SBTarget &target);
>>
>>
>> So the right way to clean up the target is:
>>
>> self.dbg.DeleteTarget(target);
>>
>> Even though there might be code within LLDB that has a valid shared
>> pointer to the lldb_private::Target still, it calls
>> lldb_private::Target::Destroy() which clears out most instance variable
>> (the module list, the process, any plug-ins, etc).
>>
>> SBTarget objects have strong references so that they _can_ keep the
>> object alive if needed in case someone else destroys the target on another
>> thread, but they don't control the lifetime of the target.
>>
>> Other objects have weak references to the objects: SBProcess, SBThread,
>> SBFrame. If the objects are actually destroyed already, the weak pointer
>> won't be able to get a valid shared pointer to the underlying object
>> and any SB API calls on these objects will return error, none, zero,
>> etc...
>>
>> >
>> > The result of the lingering handle is that, when the next test case in
>> the test suite tries to re-build the executable, it fails because the file
>> is not writable. (This is problematic on Windows because the file system
>> works differently in this regard than Unix derivatives.) Every subsequent
>> case in the test suite fails.
>> >
>> > I managed to make the test work reliably by rewriting it like this:
>> >
>> > def find_functions(self, exe_name):
>> > """Exercise SBTaget.FindFunctions() API."""
>> > exe = os.path.join(os.getcwd(), exe_name)
>> >
>> > # Create a target by the debugger.
>> > target = self.dbg.CreateTarget(exe)
>> > self.assertTrue(target, VALID_TARGET)
>> >
>> > try:
>> > list = target.FindFunctions('c', lldb.eFunctionNameTypeAuto)
>> > self.assertTrue(list.GetSize() == 1)
>> >
>> > for sc in list:
>> > try:
>> >
>> self.assertTrue(sc.GetModule().GetFileSpec().GetFilename() == exe_name)
>> > self.assertTrue(sc.GetSymbol().GetName() == 'c')
>> > finally:
>> > del sc
>> >
>> > finally:
>> > del list
>> >
>> > The finally blocks ensure that the corresponding C++ objects are
>> destroyed, even if the function exits as a result of a Python exception
>> (e.g., if one of the assertion expressions is false and the code throws an
>> exception). Since the objects are destroyed, the reference counts are back
>> to where they should be, and the orphaned module is closed when the target
>> is deleted.
>> >
>> > But this is ugly and maintaining it would be error prone. Is there a
>> better way to address this?
>>
>> So you should be able to fix this by deleting the target with
>> "self.dbg.DeleteTarget(target)"
>>
>> We could change all tests over to always store any targets they create in
>> the test object itself:
>>
>> self.target = self.dbg.CreateTarget(exe)
>>
>> Then the test suite could check for the existance of "self.target" and if
>> it exists, it could call "self.dbg.DeleteTarget(self.target)" automatically
>> to avoid such issues?
>>
>>
>>
>> _______________________________________________
>> lldb-dev mailing list
>> lldb-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/lldb-dev/attachments/20151015/191d6163/attachment-0001.html>
More information about the lldb-dev
mailing list