[lldb-dev] Python object lifetimes affect the reliability of tests

Thu Oct 15 11:36:26 PDT 2015

Couldn't we just change DeleteTarget to make sure everything is unmapped?

On Thu, Oct 15, 2015 at 11:34 AM Zachary Turner via lldb-dev <
lldb-dev at lists.llvm.org> wrote:

> To add more evidence for this, here's a small repro:
>
> import sys
>
> print "sys.exc_info() = ", "Empty" if sys.exc_info() == (None, None, None) else "Valid"
> try:
>     raise Exception
> except Exception, e:
>     print "sys.exc_info() = ", "Empty" if sys.exc_info() == (None, None, None) else "Valid"
>     pass
>
> print "sys.exc_info() = ", "Empty" if sys.exc_info() == (None, None, None) else "Valid"
> print "e = ", "Bound" if 'e' in vars() else "Unbound"
> pass
>
> For me this prints
> sys.exc_info() =  Empty
> sys.exc_info() =  Valid
> sys.exc_info() =  Valid
> e =  Bound
>
> On Thu, Oct 15, 2015 at 11:21 AM Zachary Turner <zturner at google.com>
> wrote:
>
>> We actually already do the self.dbg.DeleteTarget(target), and that's
>> the line that's failing.  The reason it's failing is because the 'sc'
>> reference is still alive, which is holding an mmap, which causes a
>> mandatory file lock on Windows.
>>
>> The diagnostics went pretty deep into python internals, but I think we
>> might have figured it out.  I don't know if this is a bug in Python, but I
>> think we'd probably need to ask Guido to be sure :)
>>
>> As far as we can tell, what happens is that on the exceptional codepath
>> (e.g. the assert fails), you walk back up the stack until you get to the
>> except handler.  This exception handler is in TestCase.run().  After it
>> handles the exception it goes and runs teardown.  However, for some reason,
>> Python is still holding a strong reference to the *traceback*, even though
>> we're completely out of the finally block.  What this means is that if you
>> call `sys.exc_info()` *even after* you've exited the finally block, it still
>> returns info about the previous exception that's not even being handled
>> anymore.  I would have expected this to be gone since there's no exception
>> in flight anymore.  So basically, Python is still holding a reference to
>> the active exception, the exception holds the stack frame, the stack frame
>> holds the test method, the test method has locals, one of which is a
>> SymbolList, a member of which is symbol context, which has the file locked.
>>
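The reference chain described above (traceback, then stack frame, then locals) can be reproduced in isolation. This is only a sketch, written for Python 3, where a lingering exception state has to be simulated by holding the traceback explicitly. `Resource`, `failing_test`, and `probes` are hypothetical names for illustration, not part of LLDB or its test suite.

```python
import gc
import sys
import weakref


class Resource(object):
    """Hypothetical stand-in for an object that pins a file open."""


probes = []  # weak references only, so this list keeps nothing alive


def failing_test():
    sc = Resource()                   # local we would like released promptly
    probes.append(weakref.ref(sc))
    raise AssertionError("simulated assertion failure")


saved_tb = None
try:
    failing_test()
except AssertionError:
    # Hold the traceback explicitly, standing in for a lingering exc_info().
    saved_tb = sys.exc_info()[2]

# traceback -> frame -> locals: 'sc' is still reachable through saved_tb.
alive_while_held = probes[0]() is not None

saved_tb = None                       # drop the traceback
gc.collect()
alive_after_release = probes[0]() is not None
print(alive_while_held, alive_after_release)
```

Under CPython 3 the local stays alive exactly as long as the traceback is held. On Python 2 the interpreter's own lingering exception state would be expected to keep it alive even after `saved_tb` is dropped, which is the situation described in this thread.
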
>> Our best guess is that if you have something like this:
>>
>> def foo():
>>     try:
>>        # Do stuff
>>     except Exception, e:
>>        pass
>>     # Do more stuff
>>
>> then if the exceptional path is executed, both e and sys.exc_info()
>> are alive *while* "do more stuff" is happening.  We've found two ways to
>> fix this:
>>
>> 1) Change to this:
>> def foo():
>>     try:
>>        # Do stuff
>>     except Exception, e:
>>        pass
>>     del e
>>     sys.exc_clear()
>>     # Do more stuff
>>
>> 2) Put the try / except inside a function.  When the function returns,
>> sys.exc_info() is cleared.
>>
>> I like 2 better, but we're still testing some more to make sure this
>> really fixes it 100% of the time.
>>
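Option 2 above might look roughly like the following. It is a sketch under assumptions rather than the change that was actually tested: `run_guarded` and `failing_action` are hypothetical names, and the code is written so that it parses on both Python 2.6+ and Python 3.

```python
import sys


def run_guarded(action):
    """Run action() and report a failure instead of letting it escape.

    Hypothetical helper: the try/except lives in its own function, so the
    exception state belongs to this call and not to the caller.
    """
    try:
        action()
    except Exception as e:        # 'as' form works on Python 2.6+ and 3
        return repr(e)            # return a string, not the exception object
    return None


def failing_action():
    raise ValueError("simulated failure")


failure = run_guarded(failing_action)

# Back in the caller, no exception should be recorded as being handled.
exc_state_is_empty = sys.exc_info() == (None, None, None)
print(failure, exc_state_is_empty)
```

Returning a description rather than the exception object is deliberate here: handing the exception itself back to the caller would keep its traceback, and therefore the failing frame's locals, reachable.
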
>> On Thu, Oct 15, 2015 at 10:25 AM Greg Clayton via lldb-dev <
>> lldb-dev at lists.llvm.org> wrote:
>>
>>>
>>> > On Oct 15, 2015, at 8:50 AM, Adrian McCarthy via lldb-dev <
>>> > lldb-dev at lists.llvm.org> wrote:
>>> >
>>> > I've tracked down a source of flakiness in tests on Windows to Python
>>> > object lifetimes and the SB interface, and I'm wondering how best to
>>> > handle it.
>>> >
>>> > Consider this portion of a test from TestTargetAPI:
>>> >
>>> >  def find_functions(self, exe_name):
>>> >      """Exercise SBTarget.FindFunctions() API."""
>>> >      exe = os.path.join(os.getcwd(), exe_name)
>>> >
>>> >      # Create a target by the debugger.
>>> >      target = self.dbg.CreateTarget(exe)
>>> >      self.assertTrue(target, VALID_TARGET)
>>> >      list = target.FindFunctions('c', lldb.eFunctionNameTypeAuto)
>>> >      self.assertTrue(list.GetSize() == 1)
>>> >
>>> >      for sc in list:
>>> >          self.assertTrue(sc.GetModule().GetFileSpec().GetFilename() == exe_name)
>>> >          self.assertTrue(sc.GetSymbol().GetName() == 'c')
>>> >
>>> > The local variables go out of scope when the function exits, but the
>>> > SB (C++) objects they represent aren't (always) immediately destroyed.
>>> > At least some of these objects keep references to the executable module
>>> > in the shared module list, so when the test framework cleans up and calls
>>> > `SBDebugger::DeleteTarget`, the module isn't orphaned, so LLDB maintains
>>> > an open handle to the executable.
>>>
>>> Creating a target with:
>>>
>>>         target = self.dbg.CreateTarget(exe)
>>>
>>> Will give you a SBTarget object that has a strong reference to the
>>> target, but the debugger still has a copy in its target list, so the
>>> SBTarget isn't designed to delete the object when the target variable goes
>>> out of scope. If you want the target to be deleted, you actually have to
>>> call through to the debugger with:
>>>
>>>
>>>  bool
>>>  SBDebugger::DeleteTarget (lldb::SBTarget &target);
>>>
>>>
>>> So the right way to clean up the target is:
>>>
>>>  self.dbg.DeleteTarget(target)
>>>
>>> Even though there might be code within LLDB that has a valid shared
>>> pointer to the lldb_private::Target still, it calls
>>> lldb_private::Target::Destroy() which clears out most instance variables
>>> (the module list, the process, any plug-ins, etc).
>>>
>>> SBTarget objects have strong references so that they _can_ keep the
>>> object alive if needed in case someone else destroys the target on another
>>> thread, but they don't control the lifetime of the target.
>>>
>>> Other objects have weak references to the objects: SBProcess, SBThread,
>>> SBFrame. If the objects are actually destroyed already, the weak pointer
>>> won't be able to get a valid shared pointer to the underlying object
>>> and any SB API calls on these objects will return error, none, zero,
>>> etc...
>>>
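As a loose analogy only (this is Python's `weakref` module, not LLDB's C++ shared and weak pointers), the behaviour described above for weakly referencing handles can be sketched like this. `Underlying` and `WeakHandle` are hypothetical names.

```python
import gc
import weakref


class Underlying(object):
    """Hypothetical stand-in for the private object behind a handle."""

    def __init__(self, name):
        self.name = name


class WeakHandle(object):
    """Hypothetical handle that does not keep its underlying object alive."""

    def __init__(self, underlying):
        self._ref = weakref.ref(underlying)

    def get_name(self):
        obj = self._ref()             # None once the object has been destroyed
        return obj.name if obj is not None else None


obj = Underlying("a.out")
handle = WeakHandle(obj)
name_while_alive = handle.get_name()

del obj                               # drop the last strong reference
gc.collect()
name_after_delete = handle.get_name()
print(name_while_alive, name_after_delete)
```

A handle built this way keeps working as an object after the thing it refers to is gone, but its calls return an empty result, which mirrors the "error, none, zero" behaviour described for the weakly referencing SB classes.
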
>>> >
>>> > The result of the lingering handle is that, when the next test case in
>>> > the test suite tries to re-build the executable, it fails because the
>>> > file is not writable.  (This is problematic on Windows because the file
>>> > system works differently in this regard than Unix derivatives.)  Every
>>> > subsequent case in the test suite fails.
>>> >
>>> > I managed to make the test work reliably by rewriting it like this:
>>> >
>>> >  def find_functions(self, exe_name):
>>> >      """Exercise SBTarget.FindFunctions() API."""
>>> >      exe = os.path.join(os.getcwd(), exe_name)
>>> >
>>> >      # Create a target by the debugger.
>>> >      target = self.dbg.CreateTarget(exe)
>>> >      self.assertTrue(target, VALID_TARGET)
>>> >
>>> >      try:
>>> >          list = target.FindFunctions('c', lldb.eFunctionNameTypeAuto)
>>> >          self.assertTrue(list.GetSize() == 1)
>>> >
>>> >          for sc in list:
>>> >              try:
>>> >                  self.assertTrue(sc.GetModule().GetFileSpec().GetFilename() == exe_name)
>>> >                  self.assertTrue(sc.GetSymbol().GetName() == 'c')
>>> >              finally:
>>> >                  del sc
>>> >
>>> >      finally:
>>> >          del list
>>> >
>>> > The finally blocks ensure that the corresponding C++ objects are
>>> > destroyed, even if the function exits as a result of a Python exception
>>> > (e.g., if one of the assertion expressions is false and the code throws
>>> > an exception).  Since the objects are destroyed, the reference counts are
>>> > back to where they should be, and the orphaned module is closed when the
>>> > target is deleted.
>>> >
>>> > But this is ugly and maintaining it would be error prone.  Is there a
>>> > better way to address this?
>>>
>>> So you should be able to fix this by deleting the target with
>>> "self.dbg.DeleteTarget(target)"
>>>
>>> We could change all tests over to always store any targets they create
>>> in the test object itself:
>>>
>>> self.target = self.dbg.CreateTarget(exe)
>>>
>>> Then the test suite could check for the existence of "self.target" and
>>> if it exists, it could call "self.dbg.DeleteTarget(self.target)"
>>> automatically to avoid such issues?
>>>
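The suggestion above, storing the target on the test object and deleting it automatically, could be sketched with plain `unittest` as follows. `FakeDebugger` is a hypothetical stand-in so the example runs without LLDB, and putting the cleanup in `tearDown` is an assumption about where the real test suite would hook it in.

```python
import unittest


class FakeDebugger(object):
    """Hypothetical stand-in for the debugger; records deleted targets."""

    def __init__(self):
        self.deleted = []

    def CreateTarget(self, exe):
        return "target:" + exe

    def DeleteTarget(self, target):
        self.deleted.append(target)
        return True


def run_example():
    dbg = FakeDebugger()

    class ExampleTest(unittest.TestCase):
        def setUp(self):
            self.dbg = dbg

        def tearDown(self):
            # Delete the stored target whether the test passed or failed.
            target = getattr(self, "target", None)
            if target is not None:
                self.dbg.DeleteTarget(target)
                self.target = None

        def test_failing(self):
            self.target = self.dbg.CreateTarget("a.out")
            self.assertTrue(False)    # simulated failure; tearDown still runs

    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExampleTest)
    result = unittest.TestResult()
    suite.run(result)
    return result, dbg


result, dbg = run_example()
print(len(result.failures), dbg.deleted)
```

Note that this only guarantees DeleteTarget is called. As discussed elsewhere in this thread, the call can still fail to release the file if other Python references to SB objects are alive at that point.
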
>>>
>>>
>>> _______________________________________________
>>> lldb-dev mailing list
>>> lldb-dev at lists.llvm.org
>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev
>>>