[llvm-dev] RFC: Cleaning up the Itanium demangler

Thu Jun 22 08:50:09 PDT 2017

On Thu, Jun 22, 2017 at 8:08 AM, Pavel Labath <labath at google.com> wrote:

> On 22 June 2017 at 15:21, Erik Pilkington <erik.pilkington at gmail.com>
> wrote:
> >
> >
> >
> > On June 22, 2017 at 5:51:39 AM, Pavel Labath (labath at google.com) wrote:
> >
> > I don't have any concrete feedback, but:
> >
> > - +1 for removing the "FastDemangler"
> >
> > - If you already construct an AST as a part of your demangling
> > process, would it be possible to export that AST for external
> > consumption somehow? Right now in lldb we sometimes need to parse the
> > demangled name (to get the "basename" of a function for example), and
> > the code for doing that is quite ugly. It would be much nicer if we
> > could just query the parsed representation of the name somehow, and
> > the AST would enable us to do that.
> >
> >
> > I was thinking about this use case a little, actually. I think it makes more
> > sense to provide a function, say getItaniumDemangledBasename(), which could
> > just parse and query the AST for the base name (the AST already has a way
> > of doing this). This would allow the demangler to bail out if it knows that
> > the rest of the input string isn’t relevant, i.e., we could bail out after
> > parsing the ‘foo’ in _Z3fooiiiiiii. That, and not having to print out the
> > AST, should make parsing the base name significantly faster on top of this.
> >
> > Do you have any other use case for the AST outside of base names? It still
> > would be possible to export it from ItaniumDemangle.
> >
>
> Well.. the current parser chops the name into "basename", "context",
> "arguments", and "qualifiers" parts. All of them seem to be used right
> now, but I don't know e.g. how unavoidable that is. I know about this
> because I was fixing some bugs there, but I am actually not that
> familiar with this part of LLDB. I am cc-ing lldb-dev in case they have any
> thoughts on this. We also have the ability to set breakpoints by
> providing just a part of the context (e.g. "breakpoint set -n
> foo::bar" even though the full function name is baz::booze::foo::bar),
> but this seems to be implemented in some different way.
>
> I don't think having the ability to short-circuit the demangling would
> bring us any speed benefit, at least not without a major refactor, as
> we demangle all the names anyway. Even the AST solution will probably
> require a fair deal of plumbing on our part to make it useful.
>
> Also, any custom-tailored solution will probably make it hard to
> retrieve any additional info, should we later need it, so I'd be in
> favor of the AST solution. (I don't know how much it would complicate
> the implementation though).
>

Ah, I see. In that case I agree that exposing the AST is the only way that
this could be done. I don't think it would be that hard to implement. It
would cause a bit of a divergence between cxa_demangle and ItaniumDemangle,
where the former would want to keep the AST private and the latter public,
but that's not the end of the world. I'd be curious to see if the LLDB folks
are interested in such an API.
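
For reference, the early bail-out idea quoted above (stop after parsing the
'foo' in _Z3fooiiiiiii) can be illustrated with a toy sketch. This is not the
ItaniumDemangle or cxa_demangle code, and the names toyItaniumBasename and
parseSourceName are hypothetical. It assumes only two shapes of input, a plain
_Z<length><name>... and a simple nested _ZN<length><name>...E..., and it
rejects everything else (templates, substitutions, cv-qualifiers, operators).

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper (not an LLVM API): reads one <decimal length><identifier>
// pair starting at Pos, stores the identifier in Out, and advances Pos past it.
static bool parseSourceName(const std::string &M, std::size_t &Pos,
                            std::string &Out) {
  std::size_t Start = Pos;
  std::size_t Len = 0;
  while (Pos < M.size() && std::isdigit(static_cast<unsigned char>(M[Pos]))) {
    Len = Len * 10 + static_cast<std::size_t>(M[Pos] - '0');
    if (Len > M.size())
      return false; // length cannot fit in the input; also avoids overflow
    ++Pos;
  }
  if (Pos == Start || Len == 0 || Len > M.size() - Pos)
    return false;
  Out = M.substr(Pos, Len);
  Pos += Len;
  return true;
}

// Hypothetical sketch: extracts the base name and returns as soon as it is
// known, without looking at the parameter types that follow.
bool toyItaniumBasename(const std::string &Mangled, std::string &Base) {
  if (Mangled.compare(0, 2, "_Z") != 0)
    return false;
  std::size_t Pos = 2;

  if (Pos < Mangled.size() && Mangled[Pos] == 'N') {
    // Simple nested name: the last component before 'E' is the base name.
    ++Pos;
    std::string Component;
    bool SawAny = false;
    while (Pos < Mangled.size() && Mangled[Pos] != 'E') {
      if (!parseSourceName(Mangled, Pos, Component))
        return false;
      SawAny = true;
    }
    if (!SawAny || Pos >= Mangled.size())
      return false; // no components, or the closing 'E' is missing
    Base = Component;
    return true; // bail out: the rest of the string is never parsed
  }

  // Unscoped name: the first source name is the base name.
  return parseSourceName(Mangled, Pos, Base);
}
```

With this sketch, _Z3fooiiiiiii yields "foo" after reading only the first five
characters, and _ZN3baz5booze3foo3barEv yields "bar". Anything that needs the
"context", "arguments", or "qualifiers" parts would still need the full AST.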