[lldb-dev] LLDB performance drop from 3.9 to 4.0

Zachary Turner via lldb-dev lldb-dev at lists.llvm.org
Thu Apr 20 11:03:06 PDT 2017

#2 is preferred.  When you do it, make sure to add lldb-commits in the
subscribers field.  You're also free to add specific people to the
reviewers field, but the mailing list specifically has to be on
subscribers.  Unfortunately, due to what I assume are bugs in Phabricator,
if you don't get the subscribers right the first time, you have to delete
and re-create the revision.  Adding it as part of an update will not cause
emails to start being sent to the list.

On Thu, Apr 20, 2017 at 11:01 AM Scott Smith <scott.smith at purestorage.com>

> What's the preferred way to post changes?  In the past I tried emailing
> here but it was pointed out I should send to lldb-commit instead.  But,
> there's also phabricator for web-based code reviews.
> So,
> 1. just email lldb-commits?
> 2. post on http://reviews.llvm.org/?
> On Thu, Apr 20, 2017 at 3:16 AM, Pavel Labath <labath at google.com> wrote:
>> Thank you very much for tracking this down.
>> +1 for making UniqueCStringMap speak ConstString -- i think it just makes
>> sense given that it already has "unique" in the name.
>> ConstString already has a GetStringRef accessor. Also adding a conversion
>> operator may be a good idea, although it probably won't help in all
>> situations (you'll still have to write StringRef(X).drop_front() etc. if
>> you want to do stringref operations on the string)
>> pl
>> On 20 April 2017 at 01:46, Zachary Turner <zturner at google.com> wrote:
>>> It doesn't entirely defeat the purpose, it's just not *as good* as
>>> making the interfaces take ConstStrings.  StringRef already has a lot of
>>> safety and usability improvements over raw char pointers, and those
>>> improvements don't disappear just because you aren't using ConstString.
>>> Although I agree that if you can make it work where the interface only
>>> accepts and returns ConstStrings, and make conversion from ConstString to
>>> StringRef more seamless, that would be an even better improvement.
>>> On Wed, Apr 19, 2017 at 5:33 PM Scott Smith <scott.smith at purestorage.com>
>>> wrote:
>>>> If I just assume the pointers are from ConstString, then doesn't that
>>>> defeat the purpose of making the interface safer?  Why not use an actual
>>>> ConstString and provide conversion operators from ConstString to
>>>> StringRef?  Seems we should be able to rely on the type system to get us
>>>> safety and performance.
>>>> I'll try putting something together tomorrow.
>>>> On Wed, Apr 19, 2017 at 4:48 PM, Zachary Turner <zturner at google.com>
>>>> wrote:
>>>>> The change was made to make the interface safer and allow propagation
>>>>> of StringRef through other layers.  The previous code was already taking a
>>>>> const char *, and so it was working under the assumption that the const
>>>>> char* passed in came from a ConstString.  As such, continuing to make that
>>>>> same assumption seems completely reasonable.
>>>>> So perhaps you can just change the operator to compare the pointers,
>>>>> as was being done before.
>>>>> On Wed, Apr 19, 2017 at 4:24 PM Scott Smith <
>>>>> scott.smith at purestorage.com> wrote:
>>>>>> It looks like it was this change:
>>>>>> commit 45fb8d00309586c3f7027f66f9f8a0b56bf1cc4a
>>>>>> Author: Zachary Turner <zturner at google.com>
>>>>>> Date:   Thu Oct 6 21:22:44 2016 +0000
>>>>>>     Convert UniqueCStringMap to use StringRef.
>>>>>>     git-svn-id: https://llvm.org/svn/llvm-project/lldb/trunk@283494
>>>>>> 91177308-0d34-0410-b5e6-96231b3b80d8
>>>>>> I'm guessing it's because the old code assumed const string, which
>>>>>> meant that uniqueness comparisons could be done by simply comparing the
>>>>>> pointer.  Now it needs to use an actual string comparison routine.  This
>>>>>> code:
>>>>>>      bool operator<(const Entry &rhs) const { return cstring <
>>>>>> rhs.cstring; }
>>>>>> didn't actually change in the revision, but cstring went from 'const
>>>>>> char *' to 'StringRef'.  If you know for sure that all the StringRefs come
>>>>>> from ConstString, then it'd be easy enough to change the comparison, but I
>>>>>> don't know how you guarantee that.
>>>>>> I assume the change was made to allow proper memory cleanup when the
>>>>>> symbols are discarded?
>>>>>> On Thu, Apr 13, 2017 at 5:37 AM, Pavel Labath <labath at google.com>
>>>>>> wrote:
>>>>>>> Bisecting the performance regression would be extremely valuable. If
>>>>>>> you want to do that, it would be very appreciated.
>>>>>>> On 12 April 2017 at 20:39, Scott Smith via lldb-dev <
>>>>>>> lldb-dev at lists.llvm.org> wrote:
>>>>>>>> For my app I think it's largely parsing debug symbols tables for
>>>>>>>> shared libraries.  My main performance improvement was to increase the
>>>>>>>> parallelism of parsing that information.
>>>>>>>> Funny, gdb/gold has a similar accelerator table (created when you
>>>>>>>> link with -gdb-index).  I assume lldb doesn't know how to parse it.
>>>>>>>> I'll work on bisecting the change.
>>>>>>>> On Wed, Apr 12, 2017 at 12:26 PM, Jason Molenda <jason at molenda.com>
>>>>>>>> wrote:
>>>>>>>>> I don't know exactly when the 3.9 / 4.0 branches were cut, and
>>>>>>>>> what was done between those two points, but in general we don't expect/want
>>>>>>>>> to see performance regressions like that.  I'm more familiar with the perf
>>>>>>>>> characteristics on macos, Linux is different in some important regards, so
>>>>>>>>> I can only speak in general terms here.
>>>>>>>>> In your example, you're measuring three things, assuming you have
>>>>>>>>> debug information for MY_PROGRAM.  The first is "Do the initial read of the
>>>>>>>>> main binary and its debug information".  The second is "Find all symbol
>>>>>>>>> names 'main'".  The third is "Scan a newly loaded solib's symbols"
>>>>>>>>> (assuming you don't have debug information from solibs from /usr/lib etc).
>>>>>>>>> Technically there's some additional stuff here -- launching the process,
>>>>>>>>> detecting solibs as they're loaded, looking up the symbol context when we
>>>>>>>>> hit the breakpoint, backtracing a frame or two, etc, but that stuff is
>>>>>>>>> rarely where you'll see perf issues on a local debug session.
>>>>>>>>> Which of these is likely to be important will depend on your
>>>>>>>>> MY_PROGRAM.  If you have a 'int main(){}', it's not going to be dwarf
>>>>>>>>> parsing.  If your binary only pulls in three solib's by the time it is
>>>>>>>>> running, it's not going to be new module scanning. A popular place to spend
>>>>>>>>> startup time is in C++ name demangling if you have a lot of solibs with C++
>>>>>>>>> symbols.
>>>>>>>>> On Darwin systems, we have a nonstandard accelerator table in our
>>>>>>>>> DWARF emitted by clang that lldb reads.  The "apple_types", "apple_names"
>>>>>>>>> etc tables.  So when we need to find a symbol named "main", for Modules
>>>>>>>>> that have a SymbolFile, we can look in the accelerator table.  If that
>>>>>>>>> SymbolFile has a 'main', the accelerator table gives us a reference into
>>>>>>>>> the DWARF for the definition, and we can consume the DWARF lazily.  We
>>>>>>>>> should never need to do a full scan over the DWARF, that's considered a
>>>>>>>>> failure.
>>>>>>>>> (in fact, I'm working on a branch of the llvm.org sources from
>>>>>>>>> mid-October and I suspect Darwin lldb is often consuming a LOT more dwarf
>>>>>>>>> than it should be when I'm debugging, I need to figure out what is causing
>>>>>>>>> that, it's a big problem.)
>>>>>>>>> In general, I've been wanting to add a new "perf counters"
>>>>>>>>> infrastructure & testsuite to lldb, but haven't had time.  One thing I work
>>>>>>>>> on a lot is debugging over a bluetooth connection; it turns out that BT is
>>>>>>>>> very slow, and any extra packets we send between lldb and debugserver are
>>>>>>>>> very costly.  The communication is so fast over a local host, or over a usb
>>>>>>>>> cable, that it's easy for regressions to sneak in without anyone noticing.
>>>>>>>>> So the original idea was hey, we can have something that counts packets for
>>>>>>>>> distinct operations.  Like, this "next" command should take no more than 40
>>>>>>>>> packets, that kind of thing.  And it could be expanded -- "b main should
>>>>>>>>> fully parse the DWARF for only 1 symbol", or "p *this should only look up 5
>>>>>>>>> types", etc.
>>>>>>>>> > On Apr 12, 2017, at 11:26 AM, Scott Smith via lldb-dev <
>>>>>>>>> lldb-dev at lists.llvm.org> wrote:
>>>>>>>>> >
>>>>>>>>> > I worked on some performance improvements for lldb 3.9, and was
>>>>>>>>> about to forward port them so I can submit them for inclusion, but I
>>>>>>>>> realized there has been a major performance drop from 3.9 to 4.0.  I am
>>>>>>>>> using the official builds on an Ubuntu 16.04 machine with 16 cores / 32
>>>>>>>>> hyperthreads.
>>>>>>>>> >
>>>>>>>>> > Running: time lldb-4.0 -b -o 'b main' -o 'run' MY_PROGRAM >
>>>>>>>>> /dev/null
>>>>>>>>> >
>>>>>>>>> > With 3.9, I get:
>>>>>>>>> > real    0m31.782s
>>>>>>>>> > user    0m50.024s
>>>>>>>>> > sys    0m4.348s
>>>>>>>>> >
>>>>>>>>> > With 4.0, I get:
>>>>>>>>> > real    0m51.652s
>>>>>>>>> > user    1m19.780s
>>>>>>>>> > sys    0m10.388s
>>>>>>>>> >
>>>>>>>>> > (with my changes + 3.9, I got real down to 4.8 seconds!  But I'm
>>>>>>>>> not convinced you'll like all the changes.)
>>>>>>>>> >
>>>>>>>>> > Is this expected?  I get roughly the same results when compiling
>>>>>>>>> llvm+lldb from source.
>>>>>>>>> >
>>>>>>>>> > I guess I can spend some time trying to bisect what happened.
>>>>>>>>> 5.0 looks to be another 8% slower.
>>>>>>>>> >
>>>>>>>>> > _______________________________________________
>>>>>>>>> > lldb-dev mailing list
>>>>>>>>> > lldb-dev at lists.llvm.org
>>>>>>>>> > http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev
>>>>>>>> _______________________________________________
>>>>>>>> lldb-dev mailing list
>>>>>>>> lldb-dev at lists.llvm.org
>>>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/lldb-dev/attachments/20170420/4444570a/attachment.html>

More information about the lldb-dev mailing list