[lldb-dev] LLDB Evolution - Final Form

Mon Sep 19 14:24:12 PDT 2016

On Mon, Sep 19, 2016 at 2:02 PM Greg Clayton <gclayton at apple.com> wrote:

>
> > On Sep 19, 2016, at 1:18 PM, Zachary Turner via lldb-dev <
> lldb-dev at lists.llvm.org> wrote:
> >
> > Following up with Kate's post from a few weeks ago, I think the dust has
> settled on the code reformat and it went over pretty smoothly for the most
> part.  So I thought it might be worth throwing out some ideas for where we
> go from here.  I have a large list of ideas (more ideas than time, sadly)
> that I've been collecting over the past few weeks, so I figured I would
> throw them out in the open for discussion.
> >
> > I’ve grouped the areas for improvement into 3 high level categories.
> >
> >       • De-inventing the wheel - We should use more code from LLVM, and
> delete code in LLDB where LLVM provides a solution.  In cases where there
> is an LLVM thing that is *similar* to what we need, we should extend the
> LLVM thing to support what we need, and then use it.  Following are some
> areas I've identified.  This list is by no means complete.  For each one,
> I've given a personal assessment of how likely it is to cause some
> (temporary) hiccups, how much it would help us in the long run, and how
> difficult it would be to do.  Without further ado:
> >               • Use llvm::Regex instead of lldb::Regex
> >                       • llvm::Regex doesn’t support enhanced mode.
> Could we add support for this to llvm::Regex?
> >                       • Risk: 6
> >                       • Impact: 3
> >                       • Difficulty / Effort: 3  (5 if we have to add
> enhanced mode support)
>
> As long as it supports all of the features, then this is fine. Otherwise
> we will need to modify the regular expressions to work with the LLVM
> version or back port any advances regex features we need into the LLVM
> regex parser.
>
> >               • Use llvm streams instead of lldb::StreamString
> >                       • Supports output re-targeting (stderr, stdout,
> std::string, etc), printf style formatting, and type-safe streaming
> operators.
> >                       • Interoperates nicely with many existing llvm
> utility classes
> >                       • Risk: 4
> >                       • Impact: 5
> >                       • Difficulty / Effort: 7
>
> I have worked extensively with both C++ streams and with printf. I don't
> find the type safe streaming operators to be readable and a great way to
> place things into streams. Part of my objection to this will be quelled if
> the LLVM streams are not stateful like C++ streams are (add an extra
> "std::hex" into the stream and suddenly other things down stream start
> coming out in hex). As long as printf functionality is in the LLVM streams
> I am ok with it.
>
I agree with you on the streams for the most part.  llvm streams do not use
stateful manipulators, although they do have stateless manipulators.  For
example, you can write:

stream << llvm::format_hex(n);

and that would print n as hex, but it wouldn't affect the printing of
subsequent items.  You also still get printf style formatting by using the
llvm::format stateless manipulator.  Like this:

stream << llvm::format("%0.4f", myfloat)

>
> >               • Use llvm::Error instead of lldb::Error
> >                       • llvm::Error is an error class that *requires*
> you to check whether it succeeded or it will assert.  In a way, it's
> similar to a C++ exception, except that it doesn't come with the
> performance hit associated with exceptions.  It's extensible, and can be
> easily extended to support the various ways LLDB needs to construct errors
> and error messages.
> >                       • Would need to first rename lldb::Error to
> LLDBError so that te conversion from LLDBError to llvm::Error could be done
> incrementally.
> >                       • Risk: 7
> >                       • Impact: 7
> >                       • Difficulty / Effort: 8
>
> We must be very careful here. Nothing can crash LLDB and adding something
> that will be known to crash in some cases can only be changed if there is a
> test that can guarantee that it won't crash. assert() calls in a shared
> library like LLDB are not OK and shouldn't fire. Even if they do, when
> asserts are off, then it shouldn't crash LLDB. We have made efforts to only
> use asserts in LLDB for things that really can't happen, but for things
> like constructing a StringRef with NULL is one example of why I don't want
> to use certain classes in LLVM. If we can remove the crash threat, then
> lets use them. But many people firmly believe that asserts belong in llvm
> and clang code and I don't agree. Also many asserts are in header files so
> even if you build llvm and clang with asserts off, but then build our
> project that uses LLVM with asserts on, you get LLVM and clang asserts when
> you don't want them.
>
We can probably add something to llvm::Error to make it not assert but to
do something else instead.  Something that would be up to the person
raising the error, so we could say that all lldb errors wouldn't cause a
problem.

As an aside, have you guys ever talked about the idea of using out of
process isolation so it doesn't bring down all of xcode when it crashes?
Obviously it's a ton of work, but it would solve these kidns of problems.

>
> >               • StringRef instead of const char *, len everywhere
> >                       • Can do most common string operations in a way
> that is guaranteed to be safe.
> >                       • Reduces string manipulation algorithm complexity
> by an order of magnitude.
> >                       • Can potentially eliminate tens of thousands of
> string copies across the codebase.
> >                       • Simplifies code.
> >                       • Risk: 3
> >                       • Impact: 8
> >                       • Difficulty / Effort: 7
>
> I don't think StringRef needs to be used everywhere, but it many places it
> does make sense. For example our public API should not contain any LLVM
> classes in the API. "const char *" is a fine parameter for public
> functions. I really hate the fact that constructing llvm::StringRef with
> NULL causes an assertion. Many people don't know that and don't take that
> into account when making their changes. I would love to see the assertion
> taken out to tell you the truth. That would make me feel better about using
> StringRef in many more places within LLDB as we shouldn't ever crash due to
> this. I would expect most internal APIs are safe to move to
> llvm::StringRef, but we will need to make sure none of these require a null
> terminated string which would cause us to have to call "str.str().c_str()".
> So internally, yes we can cut over to more use of StringRef. But the public
> API can't be changed.
>
Definitely no changing the public API, I agree with you there.  A lot of
times where we currently "need" a null terminated string, that's only
because the null terminated string is being passed to a C api which has a
StringRef counterpart.  Not always, but often.  I think when you have
StringRefs all the way down, the problem of constructing it with a null
pointer largely goes away because you don't have that many pointers left to
begin with.

>
>
> >               • ArrayRef instead of const void *, len everywhere
> >                       • Same analysis as StringRef
>
> Internally yes, external APIs no.
> >               • MutableArrayRef instead of void *, len everywhere
> >                       • Same analysis as StringRef
> Internally yes, external APIs no.
> >               • Delete ConstString, use a modified StringPool that is
> thread-safe.
> >                       • StringPool is a non thread-safe version of
> ConstString.
> >                       • Strings are internally refcounted so they can be
> cleaned up when they are no longer used.  ConstStrings are a large source
> of memory in LLDB, so ref-counting and removing stale strings has the
> potential to be a huge savings.
> >                       • Risk: 2
> >                       • Impact: 9
> >                       • Difficulty / Effort: 6
>
> We can't currently rely on ref counting. We currently hand out "const char
> *" as return values from many public API calls and these rely on the
> strings living forever. We many many existing clients and we can't change
> the public API. So I vote to avoid this. We are already using LLVM string
> pools under the covers and we also associate the mangled/demangled names in
> the map as is saves us a ton of cycles when parsing stuff as we never
> demangle (very expensive) the same name twice. So I don't see the need to
> change this as it is already backed by LLVM technology under the covers.
>
Just thinking out loud here, but one possibility is to have multiple
pools.  One which is ref counted and one which isn't.  If you need
something to live forever, vend it from the non-ref-counted pool.
Otherwise vend it from the ref counted pool.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/lldb-dev/attachments/20160919/ef7d607b/attachment.html>