[llvm-dev] RFC: Cleaning up the Itanium demangler

Fri Jul 7 15:42:01 PDT 2017

I put up some patches for this here:
https://reviews.llvm.org/D35159
https://reviews.llvm.org/D35158

Thanks,
Erik

On 6/21/17 4:42 PM, Erik Pilkington wrote:
> Hello all,
> The Itanium demangler in libcxxabi (and also in llvm/lib/Demangle) is 
> really slow. This is largely because the textual representation of the 
> symbol that is being demangled is held in a std::string, and 
> manipulations done during parsing are done on that string. The 
> demangler is always concatenating strings and inserting into the 
> middle of strings, which is terrible. The fact that the parsing logic 
> and the string manipulation/formatting logic are interleaved also makes 
> the demangler pretty ugly. Another problem is that the demangler uses 
> a lot of stack space, and has a bunch of stack overflows filed against it.
>
> I've been working on fixing this by parsing first into an AST 
> structure, and then traversing that AST to produce a demangled string. 
> This provides a significant performance improvement and also makes the 
> demangler somewhat cleaner. Attached you should find a patch to 
> this effect. This patch is still very much a work in progress, but 
> currently passes the libcxxabi test suite and demangles all the 
> symbols in LLVM identically to the current demangler. It also provides 
> a significant performance improvement: it demangles the symbols in 
> LLVM about 3.7 times faster than the current demangler. Also, 
> separating the formatting code from the parser reduces stack usage 
> (the activation frame for parse_type shrank from 416 to 144 bytes on 
> my machine). The stack usage is still pretty bad, but this helps with 
> some of it.
>
> Does anyone have any early feedback on the patch? Does this seem like 
> a good direction for the demangler?
>
> As far as future plans for this file, I have a few more refactorings 
> and performance improvements that I'd like to get through. After that, 
> it might be interesting to try to replace the FastDemangle.cpp 
> demangler in LLDB with this, to restore the one true demangler in the 
> source tree. FastDemangle.cpp is only partially complete, and 
> calls out to ItaniumDemangle.cpp in llvm (which is a copy of 
> cxa_demangle.cpp) if it fails to parse the symbol.
>
> Any thoughts here would be appreciated!
> Thanks,
> Erik
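
To make the parse-first, print-later idea from the quoted message
concrete, here is a minimal sketch. It is not taken from the patches
above: the names Node, NameNode, NestedNameNode, ToyParser and
demangleToy are all hypothetical, and the toy grammar covers only "_Z"
followed by a single source name or a nested name of source names. The
point is only that the parser builds nodes without touching any output
string, and the text is produced afterwards in one append-only pass.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Base class for AST nodes. Parsing only builds nodes; text is produced
// later by print(), which appends to a single output buffer.
struct Node {
  virtual ~Node() = default;
  virtual void print(std::string &Out) const = 0;
};

// <source-name> ::= <length> <identifier>, e.g. "3foo"
struct NameNode : Node {
  std::string Name;
  explicit NameNode(std::string N) : Name(std::move(N)) {}
  void print(std::string &Out) const override { Out += Name; }
};

// Toy nested name: N <source-name>+ E, printed as "a::b::c"
struct NestedNameNode : Node {
  std::vector<std::unique_ptr<Node>> Parts;
  void print(std::string &Out) const override {
    for (std::size_t I = 0; I < Parts.size(); ++I) {
      if (I != 0)
        Out += "::";
      Parts[I]->print(Out);
    }
  }
};

// Hypothetical toy parser (an assumption for illustration only).
class ToyParser {
  const std::string &S;
  std::size_t Pos = 0;

  std::unique_ptr<Node> parseSourceName() {
    std::size_t Len = 0;
    bool SawDigit = false;
    while (Pos < S.size() &&
           std::isdigit(static_cast<unsigned char>(S[Pos]))) {
      Len = Len * 10 + static_cast<std::size_t>(S[Pos++] - '0');
      SawDigit = true;
      if (Len > S.size())
        return nullptr; // length cannot fit; also avoids overflow
    }
    if (!SawDigit || Len == 0 || Len > S.size() - Pos)
      return nullptr;
    std::unique_ptr<Node> N = std::make_unique<NameNode>(S.substr(Pos, Len));
    Pos += Len;
    return N;
  }

  std::unique_ptr<Node> parseNestedName() {
    assert(Pos < S.size() && S[Pos] == 'N');
    ++Pos; // consume 'N'
    auto Nested = std::make_unique<NestedNameNode>();
    while (Pos < S.size() && S[Pos] != 'E') {
      std::unique_ptr<Node> Part = parseSourceName();
      if (!Part)
        return nullptr;
      Nested->Parts.push_back(std::move(Part));
    }
    if (Pos >= S.size() || Nested->Parts.empty())
      return nullptr; // missing 'E' terminator or empty name
    ++Pos; // consume 'E'
    return std::unique_ptr<Node>(std::move(Nested));
  }

public:
  explicit ToyParser(const std::string &Mangled) : S(Mangled) {}

  // Returns the AST root, or nullptr if the input is outside the toy grammar.
  std::unique_ptr<Node> parse() {
    if (S.compare(0, 2, "_Z") != 0)
      return nullptr;
    Pos = 2;
    if (Pos >= S.size())
      return nullptr;
    std::unique_ptr<Node> Root =
        S[Pos] == 'N' ? parseNestedName() : parseSourceName();
    if (Pos != S.size())
      return nullptr; // trailing characters we do not understand
    return Root;
  }
};

// Parse into an AST, then print it. On failure, return the input unchanged.
std::string demangleToy(const std::string &Mangled) {
  ToyParser P(Mangled);
  std::unique_ptr<Node> Root = P.parse();
  if (!Root)
    return Mangled;
  std::string Out;
  Root->print(Out);
  return Out;
}
```

With this sketch, demangleToy("_ZN3foo3barE") yields "foo::bar", and
input outside the toy grammar comes back unchanged. Because print()
only ever appends to Out, nothing is inserted into the middle of a
string, which is the cost the quoted message attributes to the current
design.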