[llvm-dev] RFC: Cleaning up the Itanium demangler
Pavel Labath via llvm-dev
llvm-dev at lists.llvm.org
Thu Jun 22 08:08:18 PDT 2017
On 22 June 2017 at 15:21, Erik Pilkington <erik.pilkington at gmail.com> wrote:
> On June 22, 2017 at 5:51:39 AM, Pavel Labath (labath at google.com) wrote:
>> I don't have any concrete feedback, but:
>> - +1 for removing the "FastDemangler"
>> - If you already construct an AST as a part of your demangling
>> process, would it be possible to export that AST for external
>> consumption somehow? Right now in lldb we sometimes need to parse the
>> demangled name (to get the "basename" of a function for example), and
>> the code for doing that is quite ugly. It would be much nicer if we
>> could just query the parsed representation of the name somehow, and
>> the AST would enable us to do that.
> I was thinking about this use case a little, actually. I think it makes more
> sense to provide a function, say getItaniumDemangledBasename(), which could
> just parse and query the AST for the base name (the AST already has a way
> of doing this). This would allow the demangler to bail out if it knows that
> the rest of the input string isn’t relevant, i.e., we could bail out after
> parsing the ‘foo’ in _Z3fooiiiiiii. That, and not having to print out the
> AST should make parsing the base name significantly faster on top of this.
> Do you have any other use case for the AST outside of base names? It still
> would be possible to export it from ItaniumDemangle.
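
To make the early bail-out idea concrete, here is a minimal sketch of reading only the base name out of a mangled symbol. It assumes the simplest case, an unscoped function of the form _Z<length><identifier>..., and stops as soon as the identifier has been read. The name simpleMangledBasename is hypothetical; it is neither the proposed getItaniumDemangledBasename() nor an existing LLVM API, and it builds no AST.

```cpp
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper (not an LLVM API): returns the base name of a simple,
// unscoped Itanium-mangled function such as "_Z3fooiiiiiii".
// Nested names (_ZN...E), templates, substitutions and operators are
// deliberately not handled and yield std::nullopt.
std::optional<std::string> simpleMangledBasename(std::string_view mangled) {
  // Every Itanium-mangled symbol starts with "_Z".
  if (mangled.substr(0, 2) != "_Z")
    return std::nullopt;
  mangled.remove_prefix(2);

  // <source-name> ::= <positive length number> <identifier>
  std::size_t len = 0;
  std::size_t digits = 0;
  while (digits < mangled.size() &&
         std::isdigit(static_cast<unsigned char>(mangled[digits]))) {
    len = len * 10 + static_cast<std::size_t>(mangled[digits] - '0');
    if (len > mangled.size())
      return std::nullopt; // Length cannot fit; also avoids overflow.
    ++digits;
  }
  if (digits == 0 || len == 0 || len > mangled.size() - digits)
    return std::nullopt;

  // Bail out here: the parameter encoding that follows is never inspected.
  return std::string(mangled.substr(digits, len));
}
```

Under these assumptions, simpleMangledBasename("_Z3fooiiiiiii") yields "foo" without looking at the trailing "iiiiiii", while a nested name such as "_ZN3foo3barEv" is rejected rather than guessed at.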
Well, the current parser chops the name into "basename", "context",
"arguments", and "qualifiers" parts. All of them seem to be used right
now, but I don't know e.g. how unavoidable that is. I know about this
because I was fixing some bugs there, but I am actually not that
familiar with this part of LLDB. I am cc-ing lldb-dev if they have any
thoughts on this. We also have the ability to set breakpoints by
providing just a part of the context (e.g. "breakpoint set -n
foo::bar" even though the full function name is baz::booze::foo::bar),
but this seems to be implemented in some different way.
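
As a rough illustration of the four-way split mentioned above, the sketch below cuts a demangled name into context, basename, arguments and qualifiers by scanning the string. It is not LLDB's actual parser: NameParts and splitDemangledName are hypothetical names, and the code assumes a plain function name with no return type, no operator() and no function-pointer arguments.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical result type; the field names mirror the four parts named above.
struct NameParts {
  std::string context;
  std::string basename;
  std::string arguments;
  std::string qualifiers;
};

// Hypothetical helper: splits e.g. "a::b::f(int) const" into its parts.
// Returns std::nullopt if no balanced argument list is found.
std::optional<NameParts> splitDemangledName(std::string_view name) {
  constexpr std::size_t npos = std::string_view::npos;

  // Find the '(' that opens the argument list, ignoring template arguments.
  std::size_t open = npos;
  int angle = 0;
  for (std::size_t i = 0; i < name.size(); ++i) {
    if (name[i] == '<') ++angle;
    else if (name[i] == '>') --angle;
    else if (name[i] == '(' && angle == 0) { open = i; break; }
  }
  if (open == npos)
    return std::nullopt;

  // Find the ')' that matches it.
  std::size_t close = npos;
  int depth = 0;
  for (std::size_t i = open; i < name.size(); ++i) {
    if (name[i] == '(') ++depth;
    else if (name[i] == ')' && --depth == 0) { close = i; break; }
  }
  if (close == npos)
    return std::nullopt;

  // Within the qualified name, locate the last "::" outside of <...>.
  std::string_view qualified = name.substr(0, open);
  std::size_t sep = npos;
  angle = 0;
  for (std::size_t i = 0; i + 1 < qualified.size(); ++i) {
    if (qualified[i] == '<') ++angle;
    else if (qualified[i] == '>') --angle;
    else if (angle == 0 && qualified[i] == ':' && qualified[i + 1] == ':') {
      sep = i;
      ++i; // Skip the second ':' of the separator.
    }
  }

  NameParts parts;
  if (sep != npos) {
    parts.context = std::string(qualified.substr(0, sep));
    parts.basename = std::string(qualified.substr(sep + 2));
  } else {
    parts.basename = std::string(qualified);
  }
  parts.arguments = std::string(name.substr(open, close - open + 1));

  // Whatever follows the argument list (e.g. " const") is the qualifiers.
  std::string_view rest = name.substr(close + 1);
  while (!rest.empty() && rest.front() == ' ')
    rest.remove_prefix(1);
  parts.qualifiers = std::string(rest);
  return parts;
}
```

String scanning of this kind breaks down quickly on real-world names, which is part of why querying a parsed representation is attractive.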
I don't think having the ability to short-circuit the demangling would
bring us any speed benefit, at least not without a major refactor, as
we demangle all the names anyway. Even the AST solution will probably
require a fair deal of plumbing on our part to make it useful.
Also, any custom-tailored solution will probably make it hard to
retrieve any additional info, should we later need it, so I'd be in
favor of the AST solution. (I don't know how much it would complicate
the implementation though).