[lldb-dev] lldb.frame.EvaluateExpression slows down when called a lot

Sat Apr 19 10:26:52 PDT 2014

I've been playing around with the patch attached today. It seemed to make
sense to try a set instead of map since there wasn't really key/value pairs
to store in the map. I though about using llvm::SmallSet but there's no
iterator, which seems to be needed in the destructor. As is, my C++ is
really rusty and I'm not entirely sure that I'm freeing things correctly in
the destructor.

The good news is this does fix the speed issues I was seeing. I was curious
to hear some thoughts about the best way to make this change and then I
could clean up the patch and submit it to lldb-commit

Thanks,
Scott Knight


On Fri, Apr 18, 2014 at 12:28 PM, <jingham at apple.com> wrote:

> The cluster manager's job is to make sure that you can hand out an SBValue
> for one sub-element of a "root value" and have it keep alive all the parent
> objects.  For instance you could do:
>
> sub_element = my_frame.GetValueForVariablePath("foo.bar.baz->lala->dada")
>
> and for sub_element to be valid it needs to keep alive all the parent
> elements "foo", "foo.bar", etc.  Since nothing else is referencing them, we
> use this "ClusterManager" to keep them alive.  There's one ClusterManager
> for each root value object, and all the sub-elements get registered with it.
>
> It currently keeps the sub-elements in a std::vector, and when you make a
> new shared pointer for a member (either because you are adding an element
> or getting a shared pointer to an extant element) it has to look it up in
> its list.  It looks like you are making root objects with many sub-objects?
>  If so, then we're running into what is probably a linear search in
> std::vector::find.  We can try switching this over to a map.  That will
> make adding elements slower, and make the cluster manager a little bigger,
> but it will make looking them up and managing them faster.
>
> Jim
>
>
>
> On Apr 18, 2014, at 5:35 AM, Scott Knight <knightsc at gmail.com> wrote:
>
> > I tried to put together a small sample that was similar to what ruby is
> actually doing. In the attached sample I set up a similar structure vm with
> list of pages and each page had a list of objects. The only thing on the
> object is a int flag that I just increment so when I print it out in python
> and can gauge progress. In the python script then I try to time each call
> in the loop and print that out as well. The time it takes each time through
> the loop seems to get higher and higher even though FindFirstGlobalVariable
> is only called once.
> >
> > Thanks,
> > Scott Knight
> >
> >
> > On Thu, Apr 17, 2014 at 8:50 PM, Greg Clayton <gclayton at apple.com>
> wrote:
> > How many objects are we talking about here?
> >
> > On Apr 17, 2014, at 5:46 PM, Scott Knight <knightsc at gmail.com> wrote:
> >
> > > So when I call my "ruby objects" command it will create an instance of
> my RubyObjects class and call invoke on it once. In the invoke is where I
> have this call
> > >
> > >     self.ruby_current_vm =
> lldb.value(lldb.target.FindFirstGlobalVariable('ruby_current_vm'))
> > >
> > > Then in print_stats it calls into all_objects which does this
> > >
> > >
> > >   def all_objects (self):
> > >     self.heaps_used = self.ruby_current_vm.objspace.heap_pages.used
> > >
> > >     for i in xrange(self.heaps_used):
> > >       page = self.ruby_current_vm.objspace.heap_pages.sorted[i]
> > >       print "page %i" % i
> > >
> > >       for j in xrange(page.limit):
> > >         rvalue = page.start[j].__getattr__('as')
> > >         flags = rvalue.basic.flags
> > >         yield rvalue, flags
> > >
> > > So self.ruby_current_vm should already be reused. If I stop the loop
> through page.limit everything is fast, but with the code within that second
> for loop things just slow down more and more.
> > >
> > > -Scott
> > >
> > >
> > > On Thu, Apr 17, 2014 at 8:40 PM, Greg Clayton <gclayton at apple.com>
> wrote:
> > > Since you are dealing with a global, you are not tied to a stack
> frame, so you should be able to cache this value and re-use it:
> > >
> > >
> > >
> > >     if not self.ruby_current_vm:
> > >         self.ruby_current_vm =
> lldb.value(lldb.target.FindFirstGlobalVariable('ruby_current_vm'))
> > >
> > > Then you should be able to use this over and over without re-fetching
> it. And you should be really fast. Each time you fetch a variable from
> SBTarget::FindFirstGlobalVariable(), it re-wraps the variable in a new
> VariableObjectSP which has its own cluster manager. Why? Because you might
> do something like:
> > >
> > > f = lldb.value(lldb.target.FindFirstGlobalVariable('g_ptr'))
> > >
> > > f = f.a.b.c.d
> > >
> > > Now we need a reference to the ValueObjectSP for "g_ptr" (the
> underlying variable that roots the entire expression) to stay alive as long
> as anyone has a reference to anything that is a child of "g_ptr". Here "f"
> now reference "g_ptr->a.b.c.d", so any value in this chain is correctly
> reference counted using a ClusterMananger that keeps all of them alive as
> long as someone has a reference to any of them.
> > >
> > > So if you reuse your "self.ruby_current_vm", you should only have a
> single ClusterManager and they should stay shared as long as you use them.
> Currently you are re-creating the root with each call and then referencing
> a bunch of children which adds new shared references to each cluster
> manager.
> > >
> > > Let me know how reusing the one instance goes.
> > >
> > > Greg
> > >
> > >
> > > On Apr 17, 2014, at 5:22 PM, Scott Knight <knightsc at gmail.com> wrote:
> > >
> > > > I attached the instruments trace here in case it might be helpful.
> Seems like a lot of time is spent in the ClusterManager. It seems like
> thats called from all the ValueObject. I do realize that I'm getting values
> over and over again in a loop, but it seems to just take longer each time
> through the loop. I also attached the python script I'm using in the zip
> file as well.
> > > >
> > > > -Scott
> > > >
> > > >
> > > > On Thu, Apr 17, 2014 at 7:56 PM, Greg Clayton <gclayton at apple.com>
> wrote:
> > > > No idea. If you are running this on MacOSX, I would run a time
> profile in instruments on it and see what is going on.
> > > >
> > > > On Apr 17, 2014, at 4:32 PM, Scott Knight <knightsc at gmail.com>
> wrote:
> > > >
> > > > > Any of idea why making that call over and over again would seem to
> slow down over time?
> > > > >
> > > > > -Scott
> > > > >
> > > > > On Apr 17, 2014 7:29 PM, "Greg Clayton" <gclayton at apple.com>
> wrote:
> > > > > Yep, it is the python keyword... You currently need to use your
> workaround:
> > > > >
> > > > > rvalue.__getattr__("as")
> > > > >
> > > > > Glad we found it and that there is nothing wrong with the API (we
> are finding children of anonymous unions, phew!).
> > > > >
> > > > > Greg
> > > > >
> > > > > On Apr 17, 2014, at 3:46 PM, Scott Knight <knightsc at gmail.com>
> wrote:
> > > > >
> > > > > > typedef struct RVALUE {
> > > > > >     union {
> > > > > >       struct {
> > > > > >           VALUE flags;                /* always 0 for freed obj
> */
> > > > > >           struct RVALUE *next;
> > > > > >       } free;
> > > > > >       struct RBasic  basic;
> > > > > >       struct RObject object;
> > > > > >       struct RClass  klass;
> > > > > >       struct RFloat  flonum;
> > > > > >       struct RString string;
> > > > > >       struct RArray  array;
> > > > > >       struct RRegexp regexp;
> > > > > >       struct RHash   hash;
> > > > > >       struct RData   data;
> > > > > >       struct RTypedData   typeddata;
> > > > > >       struct RStruct rstruct;
> > > > > >       struct RBignum bignum;
> > > > > >       struct RFile   file;
> > > > > >       struct RNode   node;
> > > > > >       struct RMatch  match;
> > > > > >       struct RRational rational;
> > > > > >       struct RComplex complex;
> > > > > >       struct {
> > > > > >           struct RBasic basic;
> > > > > >           VALUE v1;
> > > > > >           VALUE v2;
> > > > > >           VALUE v3;
> > > > > >       } values;
> > > > > >     } as;
> > > > > > #if GC_DEBUG
> > > > > >     const char *file;
> > > > > >     VALUE line;
> > > > > > #endif
> > > > > > } RVALUE;
> > > > > >
> > > > >
> > > >
> > > >
> > > > <lldb-cpu-time.zip>
> > >
> > >
> >
> >
> > <TestLoopSpeed.zip>_______________________________________________
> > lldb-dev mailing list
> > lldb-dev at cs.uiuc.edu
> > http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/lldb-dev/attachments/20140419/8681eb3e/attachment.html>
-------------- next part --------------
Index: include/lldb/Utility/SharedCluster.h
===================================================================

--- include/lldb/Utility/SharedCluster.h	(revision 206684)
+++ include/lldb/Utility/SharedCluster.h	(working copy)
@@ -50,10 +50,10 @@
     
     ~ClusterManager ()
     {
-        size_t n_items = m_objects.size();
-        for (size_t i = 0; i < n_items; i++)
+        typename std::set<T *>::iterator it;
+        for (it = m_objects.begin(); it != m_objects.end(); ++it)
         {
-            delete m_objects[i];
+            delete *it;
         }
         // Decrement refcount should have been called on this ClusterManager,
         // and it should have locked the mutex, now we will unlock it before
@@ -64,8 +64,7 @@
     void ManageObject (T *new_object)
     {
         Mutex::Locker locker (m_mutex);
-        if (!ContainsObject(new_object))
-            m_objects.push_back (new_object);
+        m_objects.insert (new_object);
     }
     
     typename lldb_private::SharingPtr<T> GetSharedPointer(T *desired_object)
@@ -73,20 +72,13 @@
         {
             Mutex::Locker locker (m_mutex);
             m_external_ref++;
-            assert (ContainsObject(desired_object));
+            assert (m_objects.count(desired_object));
         }
         return typename lldb_private::SharingPtr<T> (desired_object, new imp::shared_ptr_refcount<ClusterManager> (this));
     }
     
 private:
     
-    bool ContainsObject (const T *desired_object)
-    {
-        typename std::vector<T *>::iterator pos, end = m_objects.end();
-        pos = std::find(m_objects.begin(), end, desired_object);
-        return pos != end;
-    }
-    
     void DecrementRefCount () 
     {
         m_mutex.Lock();
@@ -99,7 +91,7 @@
     
     friend class imp::shared_ptr_refcount<ClusterManager>;
     
-    std::vector<T *> m_objects;
+    std::set<T *> m_objects;
     int m_external_ref;
     Mutex m_mutex;
 };