[llvm-dev] RFC: Adding an itanium c++ demangler to lib/Support

Thu May 5 17:10:29 PDT 2016

> On May 5, 2016, at 11:50 AM, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote:
> 
>> On 2016-May-05, at 11:14, David Majnemer <david.majnemer at gmail.com> wrote:
>> 
>> On Thu, May 5, 2016 at 10:58 AM, Duncan P. N. Exon Smith via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>> +Kate
>>> 
>>> We already have two demangler implementations (LLDB and libcxxabi).  I'd rather not have three.  Have you looked at the LLDB one?  I think Kate has some patches she hasn't had a chance to commit yet that add functionality.  I heard something like 10x faster, and way less stack usage (although not quite fully functional yet).  Seems like a good starting point.

I’d definitely support this and would happily answer inquiries about the existing design.  I don’t think we quite manage a ten-fold improvement, but 6-8 times faster than the libcxxabi implementation is typical at -O3.  The primary design constraint from LLDB’s perspective is raw throughput, since we generally need to demangle every symbol associated with a process in order to resolve typical requests (for example, break on a function whose base name is “main”).

The existing design is intended to be 100% accurate for the cases it can handle and to fail gracefully when it doesn’t support a particular mangling, which enables fallback to the libcxxabi implementation.  Sadly, we also carry our own copy of the latter, as we needed to work around a few crashes that cropped up late in various product cycles.
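
To make the fast-path-plus-fallback structure concrete, here is a minimal sketch.  It is not LLDB's actual code: fastDemangle and demangle are hypothetical names, and the fast path deliberately handles only one trivial mangling shape (_Z<length><identifier>v, an unscoped function taking no arguments).  Anything else is rejected so the caller can fall back to abi::__cxa_demangle from <cxxabi.h>, which is assumed to be available (GCC or Clang runtimes).

```cpp
#include <cctype>
#include <cstdlib>
#include <cxxabi.h>  // abi::__cxa_demangle; assumes a GCC/Clang C++ runtime
#include <optional>
#include <string>

// Hypothetical fast path: accepts only "_Z<length><identifier>v" and
// returns std::nullopt for every other input instead of guessing.
std::optional<std::string> fastDemangle(const std::string &mangled) {
  if (mangled.size() < 4 || mangled.compare(0, 2, "_Z") != 0)
    return std::nullopt;

  std::size_t pos = 2;
  std::size_t len = 0;
  if (!std::isdigit(static_cast<unsigned char>(mangled[pos])))
    return std::nullopt;  // nested names, operators, etc. are unsupported
  while (pos < mangled.size() &&
         std::isdigit(static_cast<unsigned char>(mangled[pos]))) {
    len = len * 10 + static_cast<std::size_t>(mangled[pos] - '0');
    if (len > mangled.size())
      return std::nullopt;  // length cannot fit; also avoids overflow
    ++pos;
  }

  // Require exactly <len> identifier characters followed by a lone 'v'.
  if (len == 0 || mangled.size() != pos + len + 1 || mangled.back() != 'v')
    return std::nullopt;
  return mangled.substr(pos, len) + "()";
}

// Try the fast path first; fall back to the runtime demangler when the
// fast path declines, and return the input unchanged if both fail.
std::string demangle(const std::string &mangled) {
  if (auto fast = fastDemangle(mangled))
    return *fast;

  int status = 0;
  char *buf = abi::__cxa_demangle(mangled.c_str(), nullptr, nullptr, &status);
  if (status == 0 && buf != nullptr) {
    std::string result(buf);
    std::free(buf);  // __cxa_demangle allocates with malloc
    return result;
  }
  std::free(buf);  // free(nullptr) is a no-op
  return mangled;
}
```

The point of the shape is that the fast path never produces a partial or approximate answer: it either recognizes the whole input or declines, so accuracy is decided entirely by which inputs it accepts.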

>>> I don't have a problem with "the one true demangler" living in lib/Support, but ideally we'd find a way to reuse it in libc++abi so that we have one, well-tested, implementation.
>> 
>> IIRC, LLDB has two demanglers: one is a copy of the libc++abi demangler and the other is a "fast-path" demangler.  There are some cases that the fast-path demangler cannot handle, which leads it to fall back to the libc++abi clone.
> 
> I think the goal of the fast-path LLDB demangler was to eventually
> be fully-functional, it just isn't there yet.

Absolutely.  We’d be happy to rely on a shared, fully functional implementation that meets our throughput needs, and I believe the fast-path demangler could be a reasonable starting point.

>> My professional opinion, having worked a lot with mangling technology, would be for us to write a new demangler that had incredibly few dependencies on anything.  This would make it easy for us to copy the source or an object file generated by the source.
> 
> This lines up with what I'm thinking, I just imagine that the LLDB
> "fast-path" demangler could be a starting point.

It’s entirely self-contained and largely stock C, with very few modern C++ conveniences.

>>> On 2016-May-05, at 06:37, Rafael Espíndola via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>> 
>>> I really want to start simple. So if adding a demangler the first
>>> objective is to add one that lets us drop the HAVE_CXXABI_H.
>>> 
>>> After that it can be expanded.
>>> 
>>> Cheers,
>>> Rafael
>>> 
>>> 
>>> On 5 May 2016 at 08:58, David Chisnall <David.Chisnall at cl.cam.ac.uk> wrote:
>>>> On 5 May 2016, at 13:47, Rafael Espíndola via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>>>> 
>>>>> * Is having an itanium demangler in lib/Support something people find
>>>>> desirable or at least acceptable?
>>>> 
>>>> Yes.
>>>> 
>>>>> * The libcxxabi code is dual licensed, would the copy in lib/Support be as well?
>>>> 
>>>> Please don’t use the one from libcxxabi.  Howard wrote one that was initially in libcxxabi but was replaced because it had memory requirements that were incompatible with one of the use cases in libcxxabi (on the out-of-memory exception path).  It is far more flexible and allows things to be hooked in at various points in the parse.  I believe that this one was written entirely by Howard during his time as an Apple employee so can likely be relicensed with Chris’s permission if required.
>>>> 
>>>>> * How much llvm-like should we try to make it? Should it take a
>>>>> StringRef, return an Error and print to a raw_ostream? Or should it
>>>>> look more like __cxa_demangle to try to make it easier to move code
>>>>> in?
>>>> 
>>>> I believe that it should be a generally useful demangler.  __cxa_demangle has a very poorly designed interface and is really only useful for turning mangled names into strings.  The earlier one makes it easy, for example, to extract the demangled name of each argument type for a function call.  This is something that I can imagine being useful in JIT FFI contexts, for example.
>>>> 
>>>> David
>>>> 
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> llvm-dev at lists.llvm.org
>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>> 
> 
