[lldb-dev] Python object lifetimes affect the reliability of tests

Thu Oct 15 11:21:57 PDT 2015

We actually do already call self.dbg.DeleteTarget(target), and that's the
line that's failing.  The reason it's failing is because the 'sc' reference
is still alive, which is holding an mmap, which causes a mandatory file
lock on Windows.

The diagnostics went pretty deep into python internals, but I think we
might have figured it out.  I don't know if this is a bug in Python, but I
think we'd probably need to ask Guido to be sure :)

As far as we can tell, what happens is that on the exceptional codepath
(e.g., the assert fails), you walk back up the stack until you get to the
except handler.  This exception handler is in TestCase.run().  After it
handles the exception it goes and runs teardown.  However, for some reason,
Python is still holding a strong reference to the *traceback*, even though
we're completely out of the finally block.  What this means is that if you
call `sys.exc_info()` *even after* you've exited the finally block, it still
returns info about the previous exception that's not even being handled
anymore.  I would have expected this to be gone since there's no exception
in flight anymore.  So basically, Python is still holding a reference to
the active exception, the exception holds the stack frame, the stack frame
holds the test method, the test method has locals, one of which is a
SymbolList, a member of which is a symbol context, which has the file locked.
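
To make that reference chain concrete, here is a minimal sketch.  It is
not the LLDB test code: Resource, failing_test and run are hypothetical
names, and Resource only stands in for an object that pins a file.  It
also assumes CPython 3, where the traceback hangs off the exception
object itself, so the sketch keeps the exception alive by hand to mimic
what we are seeing.  A weak reference reports whether the local from the
failed frame is still alive.

```python
import gc
import weakref


class Resource(object):
    """Hypothetical stand-in for an object that holds a file lock."""


def failing_test(refs):
    res = Resource()                # a local of the frame that will fail
    refs.append(weakref.ref(res))   # lets us observe res without owning it
    raise AssertionError("simulated assertion failure")


def run():
    refs = []
    saved = None
    try:
        failing_test(refs)
    except AssertionError as e:
        saved = e                   # keep the exception alive past the handler
    # exception -> traceback -> frame of failing_test -> local 'res'
    alive_while_held = refs[0]() is not None
    saved = None                    # drop the exception, and the chain with it
    gc.collect()
    alive_after_drop = refs[0]() is not None
    return alive_while_held, alive_after_drop
```

On CPython 3 I would expect run() to return (True, False): the local
survives for as long as something holds the exception, and goes away once
nothing does.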

Our best guess is that if you have something like this:

def foo():
    try:
       # Do stuff
    except Exception, e:
       pass
    # Do more stuff

that if the exceptional path is executed, then both e and sys.exc_info()
are alive *while* "do more stuff" is happening.  We've found two ways to
fix this:

1) Change to this:
def foo():
    try:
       # Do stuff
    except Exception, e:
       pass
    del e
    sys.exc_clear()
    # Do more stuff

2) Put the try / except inside a function.  When the function returns,
sys.exc_info() is cleared.

I like 2 better, but we're still testing some more to make sure this really
fixes it 100% of the time.
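
A rough sketch of what option 2 could look like, under the same
assumptions as before: run_guarded, failing_body, caller and Resource are
hypothetical names, not anything in the test suite, and the sketch is
written for CPython 3 (which already clears the exception state at the
end of the except block, so it illustrates the shape of the fix rather
than reproducing the original problem).  The helper returns only the
message, never the exception object, so nothing in the caller can keep
the traceback alive.

```python
import gc
import sys
import weakref


class Resource(object):
    """Hypothetical stand-in for an SB object that pins a file handle."""


def failing_body(refs):
    res = Resource()
    refs.append(weakref.ref(res))
    raise AssertionError("simulated assertion failure")


def run_guarded(func, *args):
    """Hypothetical helper: the try/except lives in its own frame."""
    try:
        func(*args)
    except Exception as e:
        return str(e)   # hand back the message, not the exception object
    return None


def caller():
    refs = []
    message = run_guarded(failing_body, refs)
    gc.collect()
    # "Do more stuff" would go here, after the helper has returned.
    return message, sys.exc_info()[0], refs[0]() is not None
```

The caller sees the failure message, sys.exc_info() reports no active
exception, and the weak reference shows the failed frame's local is gone
before any cleanup runs.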

On Thu, Oct 15, 2015 at 10:25 AM Greg Clayton via lldb-dev <
lldb-dev at lists.llvm.org> wrote:

>
> > On Oct 15, 2015, at 8:50 AM, Adrian McCarthy via lldb-dev <
> > lldb-dev at lists.llvm.org> wrote:
> >
> > I've tracked down a source of flakiness in tests on Windows to Python
> > object lifetimes and the SB interface, and I'm wondering how best to
> > handle it.
> >
> > Consider this portion of a test from TestTargetAPI:
> >
> >  def find_functions(self, exe_name):
> >      """Exercise SBTaget.FindFunctions() API."""
> >      exe = os.path.join(os.getcwd(), exe_name)
> >
> >      # Create a target by the debugger.
> >      target = self.dbg.CreateTarget(exe)
> >      self.assertTrue(target, VALID_TARGET)
> >      list = target.FindFunctions('c', lldb.eFunctionNameTypeAuto)
> >      self.assertTrue(list.GetSize() == 1)
> >
> >      for sc in list:
> >          self.assertTrue(sc.GetModule().GetFileSpec().GetFilename() == exe_name)
> >          self.assertTrue(sc.GetSymbol().GetName() == 'c')
> >
> > The local variables go out of scope when the function exits, but the SB
> > (C++) objects they represent aren't (always) immediately destroyed.  At
> > least some of these objects keep references to the executable module in
> > the shared module list, so when the test framework cleans up and calls
> > `SBDebugger::DeleteTarget`, the module isn't orphaned, so LLDB maintains
> > an open handle to the executable.
>
> Creating a target with:
>
>         target = self.dbg.CreateTarget(exe)
>
> will give you an SBTarget object that has a strong reference to the target,
> but the debugger still has a copy in its target list, so the SBTarget isn't
> designed to delete the object when the target variable goes out of scope.
> If you want the target to be deleted, you actually have to call through to
> the debugger with:
>
>
>  bool
>  SBDebugger::DeleteTarget (lldb::SBTarget &target);
>
>
> So the right way to clean up the target is:
>
>  self.dbg.DeleteTarget(target)
>
> Even though there might be code within LLDB that still has a valid shared
> pointer to the lldb_private::Target, DeleteTarget calls
> lldb_private::Target::Destroy(), which clears out most instance variables
> (the module list, the process, any plug-ins, etc).
>
> SBTarget objects have strong references so that they _can_ keep the object
> alive if needed in case someone else destroys the target on another thread,
> but they don't control the lifetime of the target.
>
> Other objects have weak references to the objects: SBProcess, SBThread,
> SBFrame. If the objects are actually destroyed already, the weak pointer
> won't be able to get a valid shared pointer to the underlying object
> and any SB API calls on these objects will return error, none, zero, etc...
>
> >
> > The result of the lingering handle is that, when the next test case in
> > the test suite tries to re-build the executable, it fails because the
> > file is not writable.  (This is problematic on Windows because the file
> > system works differently in this regard than Unix derivatives.)  Every
> > subsequent case in the test suite fails.
> >
> > I managed to make the test work reliably by rewriting it like this:
> >
> >  def find_functions(self, exe_name):
> >      """Exercise SBTaget.FindFunctions() API."""
> >      exe = os.path.join(os.getcwd(), exe_name)
> >
> >      # Create a target by the debugger.
> >      target = self.dbg.CreateTarget(exe)
> >      self.assertTrue(target, VALID_TARGET)
> >
> >      try:
> >          list = target.FindFunctions('c', lldb.eFunctionNameTypeAuto)
> >          self.assertTrue(list.GetSize() == 1)
> >
> >          for sc in list:
> >              try:
> >                  self.assertTrue(sc.GetModule().GetFileSpec().GetFilename() == exe_name)
> >                  self.assertTrue(sc.GetSymbol().GetName() == 'c')
> >              finally:
> >                  del sc
> >
> >      finally:
> >          del list
> >
> > The finally blocks ensure that the corresponding C++ objects are
> > destroyed, even if the function exits as a result of a Python exception
> > (e.g., if one of the assertion expressions is false and the code throws
> > an exception).  Since the objects are destroyed, the reference counts are
> > back to where they should be, and the orphaned module is closed when the
> > target is deleted.
> >
> > But this is ugly and maintaining it would be error prone.  Is there a
> > better way to address this?
>
> So you should be able to fix this by deleting the target with
> "self.dbg.DeleteTarget(target)"
>
> We could change all tests over to always store any targets they create in
> the test object itself:
>
> self.target = self.dbg.CreateTarget(exe)
>
> Then the test suite could check for the existence of "self.target" and if
> it exists, it could call "self.dbg.DeleteTarget(self.target)" automatically
> to avoid such issues?
>
>
>
> _______________________________________________
> lldb-dev mailing list
> lldb-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev
>