[llvm-dev] [lldb-dev] RFC: Cleaning up the Itanium demangler

Tue Jun 27 14:38:52 PDT 2017

FYI, I uploaded my work-in-progress patch for the Microsoft mangling scheme
to https://reviews.llvm.org/D34667.

On Thu, Jun 22, 2017 at 1:05 PM, Jim Ingham <jingham at apple.com> wrote:

> Another important criterion for the demangler in the debugger is that it
> 100% cannot crash no matter what it gets fed.  lldb used to have its own
> copy of the system demangler library because that library had bugs, and we
> needed to be able to fix them faster than the system version was.  We feed
> it all the symbols we ingest (we actually sniff them a little bit, but we
> really shouldn't have to do that; the demangler should be fast enough at
> rejecting symbols), so if there's one in some system library that triggers
> a demangler crash, you're pretty much dead in the water on that system...
>
> Jim
>
>
> > On Jun 22, 2017, at 11:11 AM, Scott Smith <scott.smith at purestorage.com> wrote:
> >
> > When I looked at demangler performance, I was able to make significant
> > improvements to the llvm demangler.  At that point removing lldb's fast
> > demangler didn't hurt performance very much, but the fast demangler was
> > still faster.  I forget (and apparently didn't write down) how much it
> > mattered, but after this change I think the difference was single-digit %.
> >
> > https://reviews.llvm.org/D32500
> >
> >
> > On Thu, Jun 22, 2017 at 11:07 AM, Jim Ingham via lldb-dev <lldb-dev at lists.llvm.org> wrote:
> > This is Greg's area, he'll be able to answer in detail how the name
> > chopper gets used.  IIRC it chops demangled names, so it is indirectly a
> > client of the demangler, but it doesn't use the demangler to do this
> > directly.  Name lookup is done by finding all the base name matches, then
> > comparing the context.  We don't do a very good job of doing fuzzy full
> > name matches - for instance when trying to break on one overload you have
> > to get the arguments exactly as the demangler would produce them.  We could
> > do some more heuristics here (remove all the spaces you can before
> > comparison, etc.) though it would be even easier if we had something that
> > could tokenize names - both mangled & natural.
> >
> > The Swift demangler produces a node tree for the demangled elements of a
> > name which is very handy on the Swift side.  A long time ago Greg
> > experimented with such a thing for the C++ demangler, but it ended up being
> > too slow.
> >
> > On that note, the demangler is a performance bottleneck for lldb.  Going
> > to the fast demangler over the system one was a big performance win.  Maybe
> > the system demangler is fast enough nowadays, but if it isn't then we can't
> > get rid of the FastDemangler.
> >
> > Jim
> >
> > > On Jun 22, 2017, at 8:08 AM, Pavel Labath via lldb-dev <lldb-dev at lists.llvm.org> wrote:
> > >
> > > > On 22 June 2017 at 15:21, Erik Pilkington <erik.pilkington at gmail.com> wrote:
> > >>
> > >>
> > >>
> > > >> On June 22, 2017 at 5:51:39 AM, Pavel Labath (labath at google.com) wrote:
> > >>
> > >> I don't have any concrete feedback, but:
> > >>
> > > >> - +1 for removing the "FastDemangler"
> > >>
> > >> - If you already construct an AST as a part of your demangling
> > >> process, would it be possible to export that AST for external
> > >> consumption somehow? Right now in lldb we sometimes need to parse the
> > >> demangled name (to get the "basename" of a function for example), and
> > >> the code for doing that is quite ugly. It would be much nicer if we
> > >> could just query the parsed representation of the name somehow, and
> > >> the AST would enable us to do that.
> > >>
> > >>
> > > >> I was thinking about this use case a little, actually. I think it makes more
> > > >> sense to provide a function, say getItaniumDemangledBasename(), which could
> > > >> just parse and query the AST for the base name (the AST already has a way
> > > >> of doing this). This would allow the demangler to bail out if it knows that
> > > >> the rest of the input string isn’t relevant, i.e., we could bail out after
> > > >> parsing the ‘foo’ in _Z3fooiiiiiii. That, and not having to print out the
> > > >> AST, should make parsing the base name significantly faster on top of this.
> > >>
> > > >> Do you have any other use case for the AST outside of base names? It still
> > > >> would be possible to export it from ItaniumDemangle.
> > >>
> > >
> > > > Well.. the current parser chops the name into "basename", "context",
> > > > "arguments", and "qualifiers" parts. All of them seem to be used right
> > > now, but I don't know e.g. how unavoidable that is. I know about this
> > > because I was fixing some bugs there, but I am actually not that
> > > familiar with this part of LLDB. I am cc-ing lldb-dev if they have any
> > > thoughts on this. We also have the ability to set breakpoints by
> > > providing just a part of the context (e.g. "breakpoint set -n
> > > foo::bar" even though the full function name is baz::booze::foo::bar),
> > > but this seems to be implemented in some different way.
> > >
> > > > I don't think having the ability to short-circuit the demangling would
> > > > bring us any speed benefit, at least not without a major refactor, as
> > > we demangle all the names anyway. Even the AST solution will probably
> > > require a fair deal of plumbing on our part to make it useful.
> > >
> > > Also, any custom-tailored solution will probably make it hard to
> > > retrieve any additional info, should we later need it, so I'd be in
> > > favor of the AST solution. (I don't know how much it would complicate
> > > the implementation though).
> > > _______________________________________________
> > > lldb-dev mailing list
> > > lldb-dev at lists.llvm.org
> > > http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev
> >
> >
>
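To make Jim's "remove all the spaces you can before comparison" heuristic from the quoted thread concrete, here is a minimal C++17 sketch. The function names are hypothetical and this is not how lldb implements name matching; it only assumes that a space can be dropped unless it separates two identifier characters (as in "unsigned int").

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <string_view>

// Hypothetical helper: true for characters that can appear in an identifier.
static bool isIdentChar(char c) {
  return std::isalnum(static_cast<unsigned char>(c)) || c == '_';
}

// Hypothetical helper (not an lldb or LLVM API): drops every space that is not
// needed to keep two identifier tokens apart, and collapses runs of spaces.
std::string normalizeNameSpacing(std::string_view name) {
  std::string out;
  out.reserve(name.size());
  bool pendingSpace = false; // a space was skipped since the last emitted char
  for (char c : name) {
    if (c == ' ') {
      pendingSpace = true;
      continue;
    }
    // Re-insert one space only between two identifier characters, so that
    // "unsigned int" does not collapse into "unsignedint".
    if (pendingSpace && !out.empty() && isIdentChar(out.back()) &&
        isIdentChar(c))
      out.push_back(' ');
    pendingSpace = false;
    out.push_back(c);
  }
  // A space is only emitted where at least one was skipped.
  assert(out.size() <= name.size());
  return out;
}
```

With this, a user-typed "foo( int ,char* )" and a demangled "foo(int, char *)" both normalize to "foo(int,char*)" and compare equal. Dropping spaces between punctuation can conflate a few distinct spellings (for example around operator names followed by template arguments), so a real tokenizer, as Jim suggests, would be the more robust route.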
>
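Erik's early bail-out idea from the quoted thread (stop after parsing the 'foo' in _Z3fooiiiiiii) can be illustrated with a small C++17 sketch. The function name simpleItaniumBasename is hypothetical; it is not the proposed getItaniumDemangledBasename() and not an LLVM API. It assumes only the simplest mangling shape, "_Z" followed by a length-prefixed source name, and rejects everything else (nested names, templates, special names) instead of guessing. In line with Jim's requirement, it returns an empty result on malformed input rather than crashing.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper: extracts the base name from "_Z<length><name>..."
// without looking at the parameter types that follow the name.
std::optional<std::string> simpleItaniumBasename(std::string_view mangled) {
  constexpr std::string_view prefix = "_Z";
  if (mangled.substr(0, prefix.size()) != prefix)
    return std::nullopt; // not an Itanium-mangled name
  mangled.remove_prefix(prefix.size());

  // Parse the decimal length of the source name.
  std::size_t len = 0;
  std::size_t digits = 0;
  while (digits < mangled.size() &&
         std::isdigit(static_cast<unsigned char>(mangled[digits]))) {
    // A length already larger than the remaining input can never be
    // satisfied; bailing out here also keeps len bounded by the input size.
    if (len > mangled.size())
      return std::nullopt;
    len = len * 10 + static_cast<std::size_t>(mangled[digits] - '0');
    ++digits;
  }
  if (digits == 0 || len == 0)
    return std::nullopt; // nested name, special name, or malformed input
  mangled.remove_prefix(digits);
  if (len > mangled.size())
    return std::nullopt; // truncated symbol

  // Everything after the name (the parameter encoding) is never examined.
  std::string_view base = mangled.substr(0, len);
  assert(base.size() == len);
  return std::string(base);
}
```

For example, simpleItaniumBasename("_Z3fooiiiiiii") yields "foo", while "_ZN3foo3barEv" (a nested name) and a plain "main" both yield an empty optional. A full implementation would still need the real parser for nested and templated names; the sketch only shows why the trailing parameter list does not have to be parsed to answer a base-name query.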