[Lldb-commits] [PATCH] D118812: [lldb] Add a setting to skip long mangled names

Thu Feb 3 01:24:22 PST 2022

labath added a comment.

In D118812#3291954 <https://reviews.llvm.org/D118812#3291954>, @JDevlieghere wrote:

> In D118812#3291482 <https://reviews.llvm.org/D118812#3291482>, @dblaikie wrote:
>
>> In D118812#3291303 <https://reviews.llvm.org/D118812#3291303>, @jingham wrote:
>>
>>> In D118812#3291109 <https://reviews.llvm.org/D118812#3291109>, @dblaikie wrote:
>>>
>>>> Any chance you might want a limit on the size of the demangled name too? (It might be worth considering what the most densely encoded mangled name is, i.e. what's the longest name that could be produced by a 10k-long mangled name, and seeing whether that's worth having another cutoff for.)
>>>
>>> Ironically, lldb seldom cares about most of the goo in these long demangled names.  At this point, we are building up our fast-lookup "name indexes".  We really only care about extracting the fully scoped names of the methods.  When we get around to doing smart matching on overloads, we can still pull out all the matches to the method name, and then do the overload match on the results.  That should be sufficiently efficient, and obviate the need to do any fancy indexing based on overloads.  So most of the work of demangling these names is not being used anyway.
>>>
>>> So the better solution for lldb on the demangling front would be a way to tell the demangler "only extract the full method name, and don't bother producing the argument list or return values".  But I have no idea how easy that would be in the demangler.
>>
>> I think there's an API level of the demangler in LLVM designed for rewriting demangled names (@rsmith created/implemented that, I think) - I'm not sure if it's structured to allow lazy parsing/stopping after you get the base name, for instance, but maybe...
>
> We should definitely look into that as a general optimization for indexing the string table and would make sense in combination with D118814 <https://reviews.llvm.org/D118814>. For this particular patch, we're trying to avoid demangling at all if the symbol is too long, so unless a partial demangle is really cheap (it might be) we'd still want to exclude symbols based on their mangled length.

The most expensive step in demangling is the actual construction of the demangled string. It's fairly easy to make that exponential (because the output string can be exponentially larger than the input). The construction of the AST (well, a kind of DAG, actually) should always be linear.
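
To make the size argument concrete, here is a toy model (not the LLVM demangler; `Node`, `buildDag` and `render` are hypothetical names invented for this sketch). Each node refers to a single shared child, which is roughly what a substitution in a mangled name amounts to, but printing expands that child twice. The DAG grows by one node per level, while the printed string roughly doubles.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for a demangler node. NOT the real LLVM node type.
struct Node {
  std::string name; // e.g. "int" for the leaf, "P" for a template
  int child;        // index of the shared child node, or -1 for a leaf
};

// Building the DAG is linear: one new node per nesting level.
std::vector<Node> buildDag(int depth) {
  std::vector<Node> dag;
  dag.push_back({"int", -1});
  for (int i = 0; i < depth; ++i)
    dag.push_back({"P", static_cast<int>(dag.size()) - 1});
  return dag;
}

// Rendering expands the shared child twice, so the output length
// roughly doubles with every level: len(k) = 2 * len(k - 1) + 5.
std::string render(const std::vector<Node> &dag, int idx) {
  const Node &n = dag[idx];
  if (n.child < 0)
    return n.name;
  std::string c = render(dag, n.child);
  return n.name + "<" + c + ", " + c + ">";
}
```

With `buildDag(10)` the DAG holds 11 nodes, yet `render(dag, 10)` produces a string of 8187 characters. Each further level adds one node and doubles the string again.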

And extracting the name this way will also save us from having to do another parse of the demangled name (to extract the base name), so it's double goodness. I don't think the actual extraction should be that hard. The trickiest part is understanding the way in which the names are encoded, so that you know what to look for.
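
As a rough illustration of "knowing what to look for", the sketch below pulls the scoped name straight out of a mangled string without ever building the demangled one. It is a toy, not lldb or LLVM code: `readSourceName` and `scopedName` are hypothetical helpers, and they only understand the simplest Itanium forms, `_Z<len><id>...` and `_ZN<len><id>...E...`. Anything else (templates, substitutions, cv-qualifiers, operators, constructors) is rejected. A real implementation would presumably walk the demangler's node tree instead; I believe `llvm::ItaniumPartialDemangler` in `llvm/Demangle/Demangle.h` is meant for that kind of use, but treat that as an assumption.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Reads one <source-name> (decimal length, then that many characters)
// starting at `pos`. Returns false if the input does not match.
static bool readSourceName(const std::string &s, std::size_t &pos,
                           std::string &out) {
  std::size_t len = 0, start = pos;
  while (pos < s.size() && std::isdigit(static_cast<unsigned char>(s[pos]))) {
    len = len * 10 + static_cast<std::size_t>(s[pos] - '0');
    if (len > s.size())
      return false; // length cannot fit; also guards against overflow
    ++pos;
  }
  if (pos == start || len == 0 || len > s.size() - pos)
    return false;
  out = s.substr(pos, len);
  pos += len;
  return true;
}

// Toy extraction of "a::b::c" from "_ZN1a1b1cE..." (or "f" from "_Z1f...").
// Stops at the end of the name; the argument list is never looked at.
// On failure the contents of `out` are unspecified.
bool scopedName(const std::string &mangled, std::string &out) {
  out.clear();
  if (mangled.compare(0, 2, "_Z") != 0)
    return false;
  std::size_t pos = 2;
  std::string part;
  if (pos < mangled.size() && mangled[pos] == 'N') {
    ++pos; // nested name: components follow until the closing 'E'
    while (pos < mangled.size() && mangled[pos] != 'E') {
      if (!readSourceName(mangled, pos, part))
        return false; // unsupported construct (template, qualifier, ...)
      if (!out.empty())
        out += "::";
      out += part;
    }
    return pos < mangled.size() && !out.empty();
  }
  if (!readSourceName(mangled, pos, part))
    return false;
  out = part;
  return true;
}
```

For `_ZN3foo3bar3bazEi` this yields `foo::bar::baz`, and for `_Z3fooi` it yields `foo`. The work is proportional to the length of the name prefix, regardless of how large the parameter list would be once demangled.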

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D118812/new/

https://reviews.llvm.org/D118812